Information Management System: 


Interactive Information Management 
Systems 


By D. T. CHAI and J. M. WIER 
(Manuscript received October 5, 1972) 


This paper and the following three describe computer systems to store, 
retrieve, and manipulate information. These have all utilized time-shared 
computer systems. All have evolved toward a system constructed of modular 
component parts and having a high degree of user interaction. Consider- 
able attention has been given to implementation in a form suitable for 
simple transfer to systems of adequate capability with minimal pro- 
gramming effort. The data bases involved are all hierarchical in organi- 
zation. The major parts are a language facility, a data base manager, a 
processing package, and numerous coordinated administration functions. 
The parts are currently assembled into a package which can be applied to 
an arbitrary hierarchically structured data base with little user effort. 
The component parts are also available for integration into more tailored 
systems for special applications. 


I. INTRODUCTION 
This paper and the three that follow it discuss various aspects of 
the problem of using computers to store, retrieve, and manipulate 
information. In particular they describe computer systems for carrying
out important parts of such work. These parts have been integrated
into a system for handling information. The system described in these 
papers has been designed so that a user and the computer system can 
interact heavily in reaching the solution to a problem posed by the user. 

Systems generically related to the ones described here have appeared 
in great numbers in the past decade.'~® In general they all use a com- 
puter to store, process, and provide results from information contained 
in a “data base” controlled by the computer. However, this deceptively
simple description hides the many differences between the systems 
which make them less generally applicable than would seem immedi- 
ately evident. No attempt will be made in the following to be complete 
in categorizing such systems. However, enough information will be 
given to place the present work in perspective with respect to im- 
portant requirements placed on such systems in various applications. 

To circumscribe the work reported here and its potential field of 
application, let us characterize information systems according to the 
properties indicated in Table I. 

The systems that have been implemented using the tools reported
here are generally most useful in applications corresponding to the
choices listed earlier in each category of Table I. The amount of
information contained in the data bases served is generally less than 
50,000,000 characters. The information is heavily structured into a 
hierarchical format. The users are typically not highly skilled in the use 
of computers. A typical request placed on the system will require fewer 
than ten seconds of processing. Finally, the user will always expect an 
answer in less than ten minutes, often in less than one minute, and 
occasionally in less than ten seconds. 

These figures are dictated by the uses to which the systems are 
usually put, tempered by economic and computer limitations. Rela- 
tively small packets of information are supplied to the system in any 
one transaction. Further, requests to provide information and pro- 
cessing are simple since the user employs on-line composition of re- 
quests and interpretation of the results delivered. 

The properties implied by this method of interaction cause the re- 
sulting system to be somewhat specialized in order to carry out such 
operations to the satisfaction of the potential users. The following are 
a few cases where the decision to handle processing in the manner indi- 
cated may adversely affect the applicability of the system to other uses. 

In order that response time to a given request be short, the system
tailors its operations to deal with a spectrum of requests assumed
known at the time of system origination. Thus, requests for large
amounts of output, complex or lengthy processing, or data stored in
some order much different from that assumed may result in poor service.
Specifically, mass business data processing is frequently not well
handled in this way.


TABLE I—CHARACTERIZATION OF INFORMATION SYSTEMS

Amount of Information:
    Up to 100,000 characters
    100,000 characters to 50,000,000 characters
    50,000,000 characters and up

Structure of Information:
    Hierarchical
    Network
    List

Users:
    Non-computer skilled
    Computer skilled

Size of Transaction:
    Less than 10 seconds of processing
    Greater than 10 seconds of processing

Time Scale:
    Less than 1 minute
    1 minute to 10 minutes
    Greater than 10 minutes

Since the system is designed to serve an interactive user as well as is 
feasible, the data base may be more difficult to update or set up in the 
first place than one specifically designed to be processed as a whole. 
In the same vein, restart procedures are generally more difficult to 
incorporate as such operations take time and thus cause poorer time 
response. 

The decision to utilize a hierarchically structured data base means 
that other organizations will be unavailable, except as they can be 
mapped onto a hierarchy. 

The concentration on serving users who are perhaps not skilled in the 
use of computers limits the complexity of potential operations. 

The exact degree of difficulty for other applications caused by each 
of these choices varies. The positive benefits obtained have been 
adjudged sufficient rewards in the thriving areas where the system to 
be described is used. 


II. SYSTEM DESCRIPTION 


All of the elements of the total system to be described in these papers 
have been implemented on a time-shared computer system. The com- 
puter system thus takes care of many of the details involved in serving 
many users. Some of the more obvious and important of these are: 


(i) Provision of an interface to a communication facility.
(ii) Provision for separating users into categories and keeping them
apart.
(iii) Provision of a flexible charging structure.
(iv) Provision for physical storage allocation.


The parts of the information management system are assembled in 
a modular fashion. Between each of them is a well-defined interface 
for exchanging information. The components are put together as shown 
in Fig. 1. 

In this figure the users are shown impinging on the system at the 
left. This contact takes place via the switched telephone network. One 
or more users can be connected to the information system described 
at any time. Each user interfaces with the Natural Dialogue System 
(NDS). The Natural Dialogue System is described more fully by 
Puerling and Roberto [7]. It provides the ability to carry on a relatively
simple interactive pseudo-English conversation with the user in order 
to ascertain his needs. 

When an adequate amount of information is available to define the 
user service request, the Natural Dialogue System passes information 
sufficient to define the request to a processor. The processor chosen is 
determined by the user-NDS dialogue. The processor then uses its 
input data to make calls on Master Links (ML) to provide specified 
information from the associated data base or send some to it. Master 
Links, using facilities described by Gibson and Stockhausen [8], carries
out the operations required on the data base and returns the data 
needed. The chosen processor then formats the response and sends it 
to the user. The whole sequence may be reinstituted by the user by 
placing a new request before the system or the user may actively (by 
signing off) or passively (by hanging up) abandon his quest. 
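
The request flow just described can be sketched in a few lines. This is only an illustration under stated assumptions: the names `dialogue`, `PROCESSORS`, `earnings_processor`, and `master_links_retrieve`, the dictionary standing in for the data base, and the stored value are all hypothetical and are not taken from NDS or Master Links.

```python
from typing import Callable, Dict, Tuple

# Hypothetical stand-in for the data base: values keyed by hierarchy position.
DATA_BASE: Dict[Tuple[str, str, str], int] = {
    ("KANSAS CITY", "PLAZA", "EARNINGS"): 10325,  # illustrative value
}


def master_links_retrieve(position: Tuple[str, str, str]) -> int:
    """Stand-in for a call on the data base manager."""
    return DATA_BASE[position]


def earnings_processor(request: Dict[str, str]) -> str:
    """A job-specific processor: fetch the data, then format the response."""
    value = master_links_retrieve((request["city"], request["store"], "EARNINGS"))
    return f"{request['store']} EARNINGS = {value}"


# The dialogue step chooses among the available processors by name.
PROCESSORS: Dict[str, Callable[[Dict[str, str]], str]] = {
    "EARNINGS": earnings_processor,
}


def dialogue(answers: Dict[str, str]) -> str:
    """Stand-in for the dialogue: pick a processor and pass it the request."""
    processor = PROCESSORS[answers["processor"]]
    return processor(answers)


reply = dialogue({"processor": "EARNINGS", "city": "KANSAS CITY", "store": "PLAZA"})
print(reply)
```

Keeping the processors in a table keyed by name mirrors the modular assembly described above: a job-specific processor can be added without touching the dialogue or the data manager stand-in.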

In this system the processors are one of two types: 


(i) Job-specific ones that have been specially programmed for an
application.

(ii) General-purpose ones that have been found to be useful in
numerous applications and that thus are provided to all users.


In addition to these elements there exist a number of auxiliary
capabilities which are necessary to the smooth and complete operation
of such systems. These capabilities are provided by numerous programming
packages. They, among other tasks, take care of loading
bulk data, checking its validity, auditing the system for efficiency and
completeness, rearranging the data into a different order, and taking
statistics on usage. These will not be described.


Fig. 1—Components of an information management system.


III. INTERACTIVE INFORMATION PROCESSING 


As was mentioned in the introduction, the systems described in this 
series of papers concentrate on the provision of a highly interactive 
contact between the user and the information management system. 
The importance of this type of interaction was dictated by the applica- 
tions which led to the design. This section discusses some of the con- 
siderations leading to the specific design decisions made. 

In a very practical sense, an understanding of the system is not 
possible without examining the environment in which it works. The 
job of solving problems involving a data base contained in an inter- 
active information system is jointly shared by the user and the system. 
Each does what “he” can do best. 

The information system takes care of data storage, data processing, 
and information display in addition to a number of housekeeping 
chores. The user brings in the problem, formulates the solution in the 
form of a sequence of requests placed before the system, and guides the 
work of the information system as it progresses. 

These operations would appear to be identical with those carried out 
in classically programmed data processing. The user (a programmer) 
translates a problem into a sequence of data processing steps which the 
computer is given to carry out. There is, however, an important differ- 
ence which makes the interactive process much better for some 
applications. 

That difference stems from the fact that it is not possible to program 
a computer to provide some solution unless an algorithm exists for 
doing it. When working with complex data bases, it is frequently 
necessary to find out a great deal about the data just to be able to 
write a suitable program. This process of “finding out about the data” 
is frequently best done by going into the data base on some exploratory 
trips. It is here that interactive data base processing is very useful. 

The user not only can collect the data, but he can also exclude vast 
areas where it is not worth taking computer time to look. This is 
possible because he gets a “feel” for the data, the limits of its range, 
its empty spots, its peculiarities. These allow him to reduce search 
times, to try simplified models that are “apparent” from looking, and 
to avoid wasting time and effort. All of these are simple for the user to 
employ while provided with immediate response from the data base. 
They are frequently difficult to program. Recognition of patterns is 
one of man’s strong points. Generation of all possible patterns to be 
explored is not. 

In order to provide this interactive capability it is necessary to 
smooth the communication between the user and the interactive com- 
puter system. This process is not a simple one. Basically, it involves a 
smooth translation from a form which is “natural” and unambiguous 
to the user to one which the computer can use on input. On output the 
process is reversed. 

Numerous studies have been made of the use of English as a com- 
munication medium for talking with a computer.?>:®°-" Unconstrained 
English serves this purpose poorly, not only because of implementation
difficulties, but because of the heavy use of context and alogical
constructions. Even the REL system [5], which has progressed a long way
toward natural language usage, requires a rather disciplined approach
to construction and meaning. Montgomery [11] has collected numerous
telling examples which clearly illustrate the difficulties. These problems
have led the designers of this system to adopt a pseudo-English
language based on independent phrases, each of which begins with a
specified keyword. The use of keywords greatly reduces the ambiguities
of the user request and, at the same time, reduces the parsing or
analyzing time by the computer. The paper by Puerling and Roberto [7]
describes the keyword style of languages that is available through the
use of the Natural Dialogue System. The paper by Heindel and Roberto [12]
describes one implementation of a keyword language for general-purpose
retrievals.

The choice of accepting independent phrases in a request also 
materially simplifies another computer-user interaction process. 
Economics and the state of technology strongly recommend a key- 
board input mechanism (other choices cost too much or are not well 
developed technically). Unfortunately, typing, particularly facile 
typing, is not a universally available skill. Thus input, for many potential
users, is clumsy and is often a source of errors. The time delays and
annoyances in this process often put off potential users and reduce the 
value of a system. The use of a phrase-type grammar provides some 
help in the system described by reducing retyping on errors to the level 
of the phrase rather than the sentence. In actual use the quantity of 
typing is further reduced by providing editing facilities which preserve 
common material already placed in the system from interaction to 
interaction. 
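
The keyword-phrase idea, and the phrase-level correction it permits, can be sketched as follows. This is a minimal illustration, not the NDS grammar: the keyword list, the function names `split_phrases` and `replace_phrase`, and the sample request are assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical keywords chosen for illustration; they are not the NDS vocabulary.
KEYWORDS = ("FOR", "PRINT", "WHERE")


def split_phrases(request: str) -> List[Tuple[str, str]]:
    """Split a request into (keyword, phrase body) pairs.

    Each phrase starts at a keyword and runs up to the next keyword.
    """
    phrases: List[Tuple[str, List[str]]] = []
    for word in request.split():
        if word.upper() in KEYWORDS:
            phrases.append((word.upper(), []))
        elif phrases:
            phrases[-1][1].append(word)
        else:
            raise ValueError(f"request must begin with a keyword, got {word!r}")
    return [(keyword, " ".join(body)) for keyword, body in phrases]


def replace_phrase(
    phrases: List[Tuple[str, str]], keyword: str, new_body: str
) -> List[Tuple[str, str]]:
    """Correct one phrase while keeping the others as already typed."""
    return [(k, new_body if k == keyword else body) for k, body in phrases]


parsed = split_phrases("FOR ALL STORES PRINT STORE NAME")
corrected = replace_phrase(parsed, "PRINT", "EARNINGS")
print(parsed)
print(corrected)
```

Because every phrase is found by its leading keyword, no sentence-level parse is needed, and a mistyped phrase can be replaced on its own. A word that happens to match a keyword inside a phrase body would be misread by this sketch; a real language would need a rule for that case.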

A second communications barrier which can exist in an interactive 
system is that of response time. If the user is employing the system in 
an interactive way in the pursuit of a solution to his problem, he finds 
that excessive delays in delivering replies to his requests create gaps in 
the continuity of his thoughts on the solution. They distract him and, 
more seriously, they affect his ability to note patterns in the output. 
They thus reduce his effectiveness in solving the problem. They also 
bore him and waste his time, both of which reduce the probability that 
a proper and prompt solution will be forthcoming. 

Because of the effect on user acceptance and user effectiveness, the 
systems to be described have been implemented with response time a 
major criterion of merit. This criterion has shaped the system in at 
least the following ways: 


(i) The complexity of a request is reduced by making simple
requests easier to formulate than complex ones.

(ii) The Master Links data base management system provides
numerous tools for tailoring a data base to the requirements of
its potential users.

(iii) The languages are designed to reduce search time in the data
base by simplifying the specification of data base delimiters.

(iv) Monitors have been provided for noting the state of the data
base and the usage by the system clientele.

(v) Dialogue is retained from request to request to reduce the
typing burden.

(vi) Numerous detectors of errors are employed and extensive
helpful (not critical) diagnostics are provided.


IV. SOME COMMENTS ON PERFORMANCE 


As has been mentioned, the systems described have been designed 
to deliver prompt response in an interactive environment. In addition 
to pursuing the goals just mentioned, the software has been designed 
to perform well in an absolute sense as well. To measure the perform- 
ance actually achieved, extensive unit testing has been employed. In 
traffic situations, simulations have been run on the performance in the 
presence of various levels of load. Overall tests of system performance 
have been designed and run. A model for evaluating system perform- 
ance as a function of the processing to be done in the data base has 
been developed. Such tests and models have been most helpful in 
comparing different system implementations and algorithms. The 
knowledge so gained has also been used in updating designs and 
optimizing system use. 

The systems described have been used in various applications with 
data bases containing up to a few tens of millions of characters of data. 
These have all been hierarchical in organization and generally did not 
employ more than ten levels in the hierarchy. By using the various 
tuning facilities, the time to return answers to typical requests can 
often be reduced below ten seconds. More complex ones occasionally 
run to a few tens of seconds, but these employ less commonly used 
facilities. In general, requests requiring extensive data searches are 
more time-consuming than those requiring less information. 

The key to good performance lies in matching the information 
management system to the needs of the application. In most applica- 
tions the system can be tailored to provide adequately prompt service 
for the spectrum of common requests, sometimes at the expense of 
less important functions. These latter can usually be handled, less 
expeditiously, without creating an operational problem as they occur 
less frequently. In the current state of the art, no economic solution 
has been found which does not require this compromise for the larger 
and structurally more complex data bases. In all of the latter it is 
always possible to find pathological interactions with the data base 
which force data base searches in a very poor order. 


V. SUMMARY 


The work done in designing, testing, and applying the systems 
described has indicated the following: 


(i) Interactive information management systems of acceptable
performance are feasible and economically attractive in the
current state of the art.

(ii) The hierarchical data base organization has been no handicap
in providing information management in most applications
tested.

(iii) It is desirable to match an information management system to
the application in order to get prompt responses from it.


REFERENCES 


1. Cuadra, C. A. (editor), Annual Review of Information Science and Technology,
    Interscience Publishers, 1966, 1967; Encyclopaedia Britannica Co., 1968, 1969,
    1971.
2. Senko, M. E., “Information Storage and Retrieval,” in Advances in Information
    Systems Science, 2 (ed. by J. T. Tou), Plenum Press, 1969, pp. 229-281.
3. Salton, G., and Lesk, M. E., “The SMART Automatic Document Retrieval
    System—an Illustration,” Comm. ACM, 8, No. 6 (June 1965), pp. 391-398.
4. Sinowitz, N. R., “DATAPLUS—A Language for Real Time Information
    Retrieval from Hierarchical Data Bases,” Proc. AFIPS, 32 (SJCC 1968),
    pp. 395-401.
5. Dostert, B. H., “REL—An Information System for a Dynamic Environment,”
    REL Report 3, California Institute of Technology, December 1971.
6. Chai, D. T., “An Information Retrieval System Using Keyword Dialog,”
    Information Storage and Retrieval, 9, No. 7 (July 1973), pp. 373-387.
7. Puerling, B. W., and Roberto, J. T., “The Natural Dialogue System,” B.S.T.J.,
    this issue, pp. 1725-1741.
8. Gibson, T. A., and Stockhausen, P. F., “MASTER LINKS—A Hierarchical
    Data System,” B.S.T.J., this issue, pp. 1691-1724.
9. Woods, W. A., “Procedural Semantics for Question Answering,” Proc. AFIPS,
    33 (FJCC 1968), pp. 457-471.
10. Kellogg, C. H., “A Natural Language Compiler for On-Line Data Management,”
    Proc. AFIPS, 33 (FJCC 1968), pp. 473-492.
11. Montgomery, C. A., “Is Natural Language an Unnatural Query Language?”
    Proc. ACM, 26 (August 1972), pp. 1075-1078.
12. Heindel, L. E., and Roberto, J. T., “The Off-The-Shelf System—A Packaged
    Information Management System,” B.S.T.J., this issue, pp. 1743-1763.




Copyright © 1973 American Telephone and Telegraph Company 
THE BELL SYSTEM TECHNICAL JOURNAL 
Vol. 52, No. 10, December, 1973 
Printed in U.S.A. 


Information Management System: 


MASTER LINKS—A Hierarchical 
Data System 


By T. A. GIBSON and P. F. STOCKHAUSEN 
(Manuscript received October 5, 1972) 


MASTER LINKS is a software system used to build, administer, and 
access hierarchical data bases. It is designed to operate in a time-sharing 
environment, and, in particular, it allows multiple concurrent updates 
and retrievals on the same data base. 

A BUILD module is used to specify the hierarchical configuration of a 
data base and an initial “storage mapping” of the elements of the hierarchy
into a particular file layout. A set of administrative routines is
provided for altering the mapping and other such maintenance purposes. 
The access routines have three levels of interface, from primitive and 
flexible to sophisticated and functional. The interfaces are all defined in 
terms of the hierarchical structure and independent of the storage mapping. 
Thus, an alteration of the storage mapping for a data base does not 
require changing any programs that access data using these interfaces. 

The lowest-level interface enables the calling program to add to the 
data base, update a value, or retrieve a value, in terms of a hierarchy 
position. The second-level interface facilitates traversal of a hierarchy 
by enabling the calling program to specify portions of the hierarchy over 
which a process is to operate. Such a specification, called an “access tree,” 
consists of data which can be generated at execution time by the calling 
routine. As in the first level, data are transferred one at a time. The third- 
level interface is a function evaluation mechanism which computes values 
from data base values and other computed values according to function 
definitions passed to it at execution time. Like an access tree, a function 
definition is itself data which can be constructed at execution time by the 
client process. 
I. MASTER LINKS OBJECTIVES 


The Master Links data system is a collection of software that 
accesses and manipulates data stored in a hierarchical structure on a 
computer’s secondary storage devices. It services requests from 
“client” programs to store and retrieve data, and to create and release 
space in the data structure. 

The Master Links project was designed with the following goals: 


(i) Provide a basic “low level” set of access mechanisms to
retrieve and store data items, and to create and delete branches
of the hierarchy. Client programs using these mechanisms
work entirely in terms of the hierarchical structure.

(ii) Provide “high level” access mechanisms that simplify the
programming task for complex retrieval requests.

(iii) Support many concurrent users on a data base, doing both
retrievals and updates.

(iv) Operate well in a time-sharing environment.

(v) Enhance portability of the system by basing its design on
machine-independent concepts.


Other goals are presented in the text of this paper. 

This report begins with a definition of the elements of hierarchical 
data structures, and a description of the basic access mechanisms, in 
Section II. Section III examines the requirements of typical client 
processes. Then high-level access mechanisms are described in Sections 
IV and V. Thus, these four sections describe the system as viewed 
by its users. Section VI delves into the system design and shows how 
the structures are arranged to provide these capabilities in portable 
form with high performance. The final section discusses the experience 
acquired with current implementations, and presents an outline of 
current and future developments of Master Links. 


II. ELEMENTS OF HIERARCHICAL DATA STRUCTURE 


The elements of a hierarchical data structure are entities, groups, 
and fields. Groups and fields are the permanent elements of a data base. 
They are established by a process called “building” the data base. 
Entities are the dynamic elements. They are added and deleted at any 
time by client programs using the basic access mechanisms of Master 
Links. Client programs also use the basic access mechanisms to trans- 
mit data for a field of an existing entity. 
2.1 Entities, Groups, and Fields 


A field is a set of data all identified by the same field name. There 
are several types of fields: numerical, character, logical, and date. 

An entity is an element which holds one value for each of a given 
list of fields. We will draw an entity as a rectangle, with the field names 
to one side and the values inside, thus: 


    STORE NAME  | PLAZA |
    EARNINGS    | 10325 |


A group is a set of entities with the same fields. A group has a name 
which indicates the nature of its entities. The name of a group will be 
written in an ellipse: 


    STORE NAME  | PLAZA | MAIN ST | RT 46 | PLAZA |    (STORE)
    EARNINGS    | 10325 | 69238   | 21420 | 96823 |


A data base is composed of a set of groups which are hierarchically 
related. One group is the top group. All others descend in a tree fashion: 


    CITY
      WAREHOUSE
        STOCK
      STORE
        YEAR
          MONTH
        DEPARTMENT
          ITEM


This is a STORE and WAREHOUSE data base. It is subdivided by 
CITY at the top. Each city of the chain of stores is represented by 
one entity in the CITY group. There are several stores per city, several 
departments per store, and several items per department. In addition, 
certain data are kept on an annual and a monthly basis for each store. 
Each city also has zero, one, or more warehouses, and there are several 
items of STOCK per warehouse. 
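
The group tree just described can be written down as a parent table. The sketch below is only an illustration: the dictionary layout and the helper names `children` and `depth` are assumptions, MONTH is placed under YEAR on the reading that monthly data are kept within each year, and counting depth from 1 at the top group is also an assumption.

```python
from typing import Dict, List, Optional

# Parent of each group, read off the description above; CITY is the top group.
PARENT: Dict[str, Optional[str]] = {
    "CITY": None,
    "WAREHOUSE": "CITY",
    "STORE": "CITY",
    "STOCK": "WAREHOUSE",
    "YEAR": "STORE",
    "MONTH": "YEAR",
    "DEPARTMENT": "STORE",
    "ITEM": "DEPARTMENT",
}


def children(group: str) -> List[str]:
    """Groups whose parent is the given group."""
    return [g for g, p in PARENT.items() if p == group]


def depth(group: str) -> int:
    """Depth counted from 1 at the top group (the numbering base is assumed)."""
    level = 1
    current = group
    while PARENT[current] is not None:
        current = PARENT[current]
        level += 1
    return level


print(children("CITY"), depth("ITEM"))
```

Storing only the parent of each group is enough because every group other than the top one has exactly one parent.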

Although called a tree, the structure is always drawn “upside
down.” This is not in fact unusual. Corporation organization charts are
frequently drawn this way, as are part lists, inventory lists, etc. It
places the major components at the top and the detailed ones lower
down.


    GROUP               FIELD NAMES
    CITY                CITY NAME
      WAREHOUSE         WAREHOUSE NAME
        STOCK           STOCK NO, UNITS
      STORE             STORE NAME, EARNINGS
        YEAR            YEAR NAME
          MONTH         MONTH NAME, NET SALES, ADVERTISING
        DEPARTMENT      DEPT NO, SALES FORCE, DOLLAR SALES
          ITEM          ITEM NO, IN STOCK, ON ORDER, BACK ORDERED, PURCHASE COST

Fig. 1—A group tree with fields.

Figure 1 shows this data base with fields assigned to each group. 
Figure 2 shows a blowup of the entities. The ellipsis (---) in a group 
indicates several entities not shown. 

The parent of a group is the group immediately above it. The top 
group has no parent. All other groups can have only one parent. 
The parent of an entity is the entity immediately above it. Entities 
in the top group have no parent and all others have one parent. The 
parental relations of entities must parallel those of the groups. Thus,
if group B has group A for its parent, then all entities of B must have
their parent entities in A.


[Figure 2 lists the field names of Fig. 1 at the left and the group names
CITY, WAREHOUSE, STORE, STOCK, YEAR, MONTH, and ITEM at the right, with
sample entities in between. The CITY entities a, b, and c are KANSAS CITY,
TOPEKA, and LOS ANGELES.]

Fig. 2—A blowup of the entities.
In Fig. 2 the entities are labeled a, b, ---, r. These labels are not a 
part of the data base, but are used only as references in this paper. 

We have made use of the terms “parent of an entity” and “parent 
of a group.” This suggests the use of other genealogical terms. The 
kth ancestor of an entity is the entity k steps above it. (Hence the 
first ancestor is the parent.) The offspring of an entity are all the 
entities immediately (one step) below it. The descendants are all the 
entities below it. 

For each entity, all its offspring in one group form a family. Entities
a, b, and c are a family; d and e another family; g and h another family.
Notice that entity a has two families under it, one in STORE and one
in WAREHOUSE. If two entities are in the same family, such as d
and e, they are siblings to one another. If two entities have the same parent,
but are in different families, such as d and g, they are step-siblings.
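
The genealogical terms can be made concrete with a small sketch. It is an illustration only: the `Entity` class and the function names are assumptions, placing d and e in WAREHOUSE and g and h in STORE is one reading consistent with the text, and the entity labeled `stock1` is hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Entity:
    label: str
    group: str
    parent: Optional["Entity"] = None


def ancestor(entity: Entity, k: int) -> Optional[Entity]:
    """Return the entity k steps above, or None if the tree is not that deep."""
    current: Optional[Entity] = entity
    for _ in range(k):
        if current is None:
            return None
        current = current.parent
    return current


def offspring(entity: Entity, entities: List[Entity]) -> List[Entity]:
    """Entities immediately (one step) below the given entity."""
    return [e for e in entities if e.parent == entity]


def descendants(entity: Entity, entities: List[Entity]) -> List[Entity]:
    """All entities below the given entity, at any depth."""
    found: List[Entity] = []
    for child in offspring(entity, entities):
        found.append(child)
        found.extend(descendants(child, entities))
    return found


def family(entity: Entity, entities: List[Entity]) -> List[Entity]:
    """Entities with the same parent and the same group, the entity included."""
    return [e for e in entities
            if e.parent == entity.parent and e.group == entity.group]


def are_siblings(x: Entity, y: Entity) -> bool:
    return x != y and x.parent == y.parent and x.group == y.group


def are_step_siblings(x: Entity, y: Entity) -> bool:
    return x.parent is not None and x.parent == y.parent and x.group != y.group


a = Entity("a", "CITY")
b = Entity("b", "CITY")
c = Entity("c", "CITY")
d = Entity("d", "WAREHOUSE", a)   # group assignment assumed
e = Entity("e", "WAREHOUSE", a)
g = Entity("g", "STORE", a)
h = Entity("h", "STORE", a)
stock1 = Entity("stock1", "STOCK", d)  # hypothetical entity, not from Fig. 2
ALL = [a, b, c, d, e, g, h, stock1]

print([x.label for x in family(d, ALL)], are_step_siblings(d, g))
```

Here a family is determined by the pair (parent, group), which is why entity a can have one family in WAREHOUSE and another in STORE.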


2.2 Building the Data Base and Entity Dynamics 


A particular data base is established by defining the group tree and 
the fields of each group. This process is called building the data base. 
The language for describing the data base is called the build language. 
Using this language a data base designer describes the permanent 
attributes of his data base and submits the description to a utility 
program called BUILD. After BUILD has processed the description, 
the data base has no entities, and no data, but only a “skeleton” 
structure. 

Entities are the dynamic components of a data base. They may be 
added or deleted online, even while other users are working on the 
data base. Thus the actual data base grows and acquires data, but 
always in accordance with the structure defined by BUILD. 
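
The split between a permanent skeleton and dynamic entities can be sketched as follows. The build language itself is not shown in this paper, so the `Skeleton` class, `define_group`, and `add_entity` are hypothetical names for illustration only; the point is that an entity can be added only where the skeleton allows it.

```python
from typing import Dict, Optional, Sequence, Tuple


class Skeleton:
    """Permanent part of a data base: the group tree and each group's fields."""

    def __init__(self) -> None:
        self.parent: Dict[str, Optional[str]] = {}
        self.fields: Dict[str, Tuple[str, ...]] = {}

    def define_group(
        self, name: str, parent: Optional[str], fields: Sequence[str]
    ) -> None:
        if parent is not None and parent not in self.parent:
            raise ValueError(f"parent group {parent!r} is not defined")
        self.parent[name] = parent
        self.fields[name] = tuple(fields)


def add_entity(
    skeleton: Skeleton, group: str, parent_entity: Optional[dict] = None
) -> dict:
    """Create an entity, refusing any placement the skeleton does not allow."""
    expected = skeleton.parent[group]
    actual = None if parent_entity is None else parent_entity["group"]
    if expected != actual:
        raise ValueError(
            f"{group} entity needs a parent in group {expected!r}, got {actual!r}"
        )
    # A new entity starts with one empty slot per field of its group.
    return {
        "group": group,
        "parent": parent_entity,
        "values": dict.fromkeys(skeleton.fields[group]),
    }


skeleton = Skeleton()
skeleton.define_group("CITY", None, ["CITY NAME"])
skeleton.define_group("WAREHOUSE", "CITY", ["WAREHOUSE NAME"])
skeleton.define_group("STOCK", "WAREHOUSE", ["STOCK NO", "UNITS"])

city = add_entity(skeleton, "CITY")
warehouse = add_entity(skeleton, "WAREHOUSE", city)
print(warehouse["values"])
```

Right after the skeleton is defined there are no entities and no data; `city` and `warehouse` come into being only through later calls, and an attempt to hang a STOCK entity directly under a CITY entity is rejected.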


2.3 Basic Access Mechanisms 


There are five basic operations which programs can perform on the 
data structure: 


(i) Select a top entity or an entity whose parent has been previously
selected.
(ii) Add a new offspring to a selected entity or add a new entity
to the top group.
(iii) Delete a selected entity.
(iv) Select a field.
(v) Transmit data to or from a selected entity for the selected
field.


These five basic operations make possible any manipulation of the 
data structure except modification of the permanent attributes of the 
data base established by BUILD. 
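
A minimal in-memory sketch of the five operations follows. It is not the Master Links interface: the class and method names (`Session`, `select_top`, `add_offspring`, and so on) are hypothetical, and the choice to let later indexes shift after a deletion is a simplification of this sketch.

```python
class Node:
    """One entity: its field values and its families of offspring."""

    def __init__(self, group, parent=None):
        self.group = group
        self.parent = parent
        self.values = {}     # field name -> value
        self.families = {}   # group name -> list of offspring in that group


class Session:
    """Current selections of one client program (names are hypothetical)."""

    def __init__(self, top_group):
        self.top_group = top_group
        self.top = []        # entities of the top group
        self.entity = None   # currently selected entity
        self.field = None    # currently selected field

    # (i) select a top entity, or an offspring of the selected entity
    def select_top(self, index):
        self.entity = self.top[index - 1]

    def select_offspring(self, group, index):
        self.entity = self.entity.families[group][index - 1]

    # (ii) add an entity; the new entity's index in its family is returned
    def add_top(self):
        self.top.append(Node(self.top_group))
        return len(self.top)

    def add_offspring(self, group):
        family = self.entity.families.setdefault(group, [])
        family.append(Node(group, self.entity))
        return len(family)

    # (iii) delete the selected entity; its parent becomes the selection
    def delete_selected(self):
        node = self.entity
        family = self.top if node.parent is None else node.parent.families[node.group]
        family.remove(node)
        self.entity = node.parent

    # (iv) select a field
    def select_field(self, name):
        self.field = name

    # (v) transmit data to or from the selected entity for the selected field
    def put(self, value):
        self.entity.values[self.field] = value

    def get(self):
        return self.entity.values[self.field]


s = Session("CITY")
s.select_top(s.add_top())
s.select_field("CITY NAME")
s.put("KANSAS CITY")
store_index = s.add_offspring("STORE")
s.select_offspring("STORE", store_index)
s.select_field("EARNINGS")
s.put(10325)
earnings = s.get()
print(earnings)
```

Every call works in terms of hierarchy position (group, index, field) rather than storage location, which is the property that lets the storage mapping change without changing client programs.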


2.4 Identifications 


A user (or a program) accessing the data base must be able to 
uniquely identify each element. Users identify elements by names, 
such as ‘KANSAS CITY,’ ‘WAREHOUSE,’ or ‘EARNINGS.’ Names 
are also called external identifiers, because they are used (by people) 
external to the software. The Master Links software uses internal 
identifiers, which are integers such as group 2, entity 7, field 13. The 
term identifier refers to both internal and external identifiers. 

Fields and groups are given unique identifiers. Figure 2 shows group 
and field names. These names are selected at the time the data base is 
built and then do not change. Their internal identifiers are positive 
integers assigned by BUILD. 

The identification of entities must be done somewhat differently, 
since they are not established by BUILD. The internal identifier of 
an entity is a positive integer called the entity index. The first entity 
of a family has index 1, the second has index 2, etc. Thus, the internal 
identifier is unique only within a family. 

This method of identifying entities allows implicit associations to be 
established among the entities of a group. The most common use of 
this is to assign the same index to all the entities which have some 
attribute in common. In this case the external identifier names that 
attribute. For example, a data base with juss MONTH and STORE 
groups is shown with internal and external identifiers. 


MONTH     1                  2                  3
          JAN                FEB                MAR

STORE     1      2           1      2           1      2
          PLAZA  MAIN ST     PLAZA  MAIN ST     PLAZA  MAIN ST

All STORE entities with index 1 have the common attribute of storing
data for the PLAZA store. One can request processing of all data
for the PLAZA store, and this will cause all STORE entities with
index 1 to be processed. Uniquely identifying a single entity in the
STORE group requires both a month and a store identifier.

Another implicit association possible is the ordering of entities. 
Again using months as an example, JAN through DEC can be assigned 
indexes 1 through 12 respectively. 
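
The index scheme can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the dictionary layout and the helper name `stores_with_index` are invented for the example and are not part of Master Links.

```python
# Hypothetical in-memory model of the MONTH/STORE example above.
# External names are attached to the internal entity indexes of each group.
MONTH_NAMES = {1: "JAN", 2: "FEB", 3: "MAR"}
STORE_NAMES = {1: "PLAZA", 2: "MAIN ST"}

# Each month entity owns one family of STORE entities, indexed from 1.
data_base = {
    month: {store: {"NET SALES": None} for store in STORE_NAMES}
    for month in MONTH_NAMES
}


def stores_with_index(index):
    """Return (month index, store index) for every STORE entity with that index."""
    return [(month, index) for month in sorted(data_base) if index in data_base[month]]


# "All data for the PLAZA store" touches one STORE entity in every month family.
plaza = stores_with_index(1)
```

A single STORE entity is still named by the pair (month index, store index), since the store index alone is unique only within one family.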


III. THE CLIENT PROCESS


Processes that access data bases have a strong tendency to access 
either many fields from a few entities, or few fields from many entities. 
An example of the first type is 


ENTER NEW SHIPMENT DATA FOR WAREHOUSE 


Values for many fields are to be put into one warehouse entity. For 
this type of request the basic access mechanisms are quite convenient. 
An example of the second type of request is 


FOR ALL STORES IN CITIES ___, ___, AND ___, PRINT 
STORE NAME, AND 1972 NET SALES PER STORE 
DIVIDED BY EARNINGS. 





Only a few fields (STORE NAME, NET SALES, and EARNINGS)
are required, but a large number of specific city, store, year, and month
entities must be accessed to fulfill this request. Further, the values of
NET SALES and EARNINGS that are retrieved must be functionally
combined into the values of “1972 NET SALES per store divided by
EARNINGS.” It is possible to do these tasks by using the basic
access mechanisms, but the programming is tedious and lengthy.
Master Links provides a set of higher-level access mechanisms that
reduces the programming of the above PRINT request to two steps:


(i) Declare which entities are to be processed.
(ii) Step to each of these entities in turn, and retrieve and print a
value for the requested function.


The entities to be processed are declared with an access tree. The 
access tree provides directions to the generator which steps to each of 
the entities in turn. Finally, the retrieval is performed by the function 
evaluator which does all the work of evaluating functions of data 
stored in a data base. These tools for client programmers are described 
in the next two sections. 


IV. ACCESS TREES AND THE GENERATOR


An access tree describes a subtree of the entities of the data base.
Thus any entity on the access tree has all its ancestors on the access
tree. It can also be visualized as a “pruned” entity tree: when an
entity is removed, so are all its descendants. Several concepts underlie
the mechanism for building an access tree:


(i) The generated group
(ii) The refined inclusion of an entity
(iii) The refined set of entities
(iv) Independently refined sets
(v) Whole-family inclusion.


These are described in turn. The data base of Fig. 2 is used for all 
examples. 


4.1 The Generated Group 


Some groups of the data base will contain data needed by the process, 
and some will not. Those that contain needed data, and all their 
ancestors, are the generated groups. The rest of the groups have no 
entities on the access tree and therefore will not be generated. 

The client process may specify what groups have needed data. It 
therefore specifies by implication the generated groups and the groups 
to be pruned from the access tree. The generated groups all have 
entities on the access tree. They are there either by refined inclusion 
or by whole inclusion. 


4.2 Refined Inclusion 


Refined inclusion means an entity has been put on the access tree 
by explicitly giving its group identity and entity identity within its 
family. In Fig. 3, KANSAS CITY has been explicitly named to the 
access tree, and therefore is a case of refined inclusion. In writing 
programs, the internal identities are used: the group number and the 
entity index within its family. In our examples in this section, we will 
use external identities, as has been done in Fig. 1. It is confusing to 
wade through a lot of numerical codes in examples when trying to 
learn about concepts. 

REFINEMENTS

GROUPS       REFINE LIST 1   REFINE LIST 2   REFINE LIST 3   REFINE LIST 4
CITY         KANSAS CITY     KANSAS CITY     TOPEKA          KANSAS CITY
STORE        PLAZA           MAIN ST         RT 46
WAREHOUSE                                                    WEST


INDEPENDENT REFINEMENTS

GROUPS       REFINE LIST 1   REFINE LIST 2
YEAR         71              72
MONTH        DEC             JAN


GENERATED GROUPS
CITY, STORE, YEAR, MONTH, WAREHOUSE, STOCK


RESULTING ACCESS TREE

KANSAS CITY
    PLAZA
        71
            DEC
        72
            JAN
    MAIN ST
        71
            DEC
        72
            JAN
    WEST
        437, ... (stock items)
TOPEKA
    RT 46
        71
            DEC
        72
            JAN


Fig. 3. Building an access tree.


The year entity whose name is 70 is not unique. There are several
such entities, one for each STORE entity. They have the same
external identity, 70, the same internal identity, index 1, and therefore
have an association from family to family by identity. Wherever this
condition exists, a single refinement can describe many entities in the
data base. This is called multiple refinement. A refinement to


GROUP REFINE LIST 
YEAR 70 


denotes every 70 entity of Fig. 2. 


Refinements can depend on specific ancestors. This happens when 
a refine list has two or more entries. Thus: 


GROUP REFINE LIST 
CITY KANSAS CITY 
STORE PLAZA 
identifies the PLAZA store only in KANSAS CITY, not the one in
LOS ANGELES. PLAZA is called a dependent refinement.

A refinement can be both multiple and dependent, in which case it is
called a multiple-dependent refinement. An example is


GROUP REFINE LIST 
YEAR 70 
MONTH JAN 


which specifies a set of 70 entities, and the JAN entities under those
70 entities.

A refinement is not restricted to immediately adjacent levels of 
the data base. The following refinement is acceptable: 


GROUP REFINE LIST 
CITY KANSAS CITY 
MONTH JAN 


The groups of a refine list must proceed down the data base from
ancestor to descendant. However, groups may be skipped in the list.


4.3 The Refine Set 


A refine set is a set of refine lists on particular groups. The groups of
the refine set may be any groups, but the first group must be an
ancestor of all the others. Figure 3 shows a refine set on the groups
CITY, STORE, and WAREHOUSE, and another refine set on YEAR
and MONTH.

A group can be involved in only one refine set. Every refine list
of a set must start with an entity from the set’s first group. Hence,
a legal refine list names entities from ancestor to descendant down
one path of the group tree, using only groups in the refine set.

A refine set allows a multiplicity of dependent refinements. The
English clause


FOR PLAZA AND MAIN ST STORES AND WEST WARE- 
HOUSE IN KANSAS CITY AND RT 46 STORE IN TOPEKA 


is easily represented as a refine set. In fact, Fig. 3 gives the refine lists
to do this.


4.4 Independent Refine Sets


Refine sets alone cannot be used to represent independent refinements.
The clause


FOR CITIES ___, ___, ___, ___, AND ___, IN THE YEARS
___, ___, ___, ___, AND ___


represents five cities, and five years in each city. The list of cities is a
refine set on one group, CITY. The list of years is a refine set on another
group, YEAR. What the entire clause specifies is the independent
combination of these two refine sets. This leads to the concept of
independent refine sets.

Refine sets are independent if their groups are mutually exclusive, 
i.e., any group can be in only one of the refine sets. Figure 3 shows a 
pair of independent refine sets and the resulting members of the access 
tree. Any number of independent refinements can be put on an access 
tree. 


4.5 Whole Inclusion 


Figure 3 has all stock items of the WEST warehouse on the access 
tree. Any generated group not in a refine set is included on the access 
tree on a whole inclusion basis. This means that whole families of the 
group are either included or excluded from the access tree, depending 
on whether their parent is included or excluded respectively. 

Using Fig. 2, other examples of whole inclusion are: 


(i) CITY group refined to KANSAS CITY, STORE group whole,
all other groups pruned from the access tree. This puts entities
a, g, and h on the access tree.

(ii) CITY and STORE groups whole. All other groups pruned.
This puts all cities and all stores on the access tree.

(iii) CITY refined to KANSAS CITY and TOPEKA. YEAR
independently refined to 71 and 72. STORE and MONTH
whole. All other groups pruned. This puts entities a, b, g, h,
and i; all 71 and 72 entities under g, h, and i; and all month
entities under those year entities onto the access tree.
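
The inclusion rules of Sections 4.1 through 4.5 can be sketched in Python. This is one reading of the text, not the Master Links implementation. The toy data base below is not Fig. 2, and the names `entity`, `included`, and `access_tree` are hypothetical. A refine set is modeled as a list of refine lists, and each refine list as a mapping from group to entity name.

```python
# Toy entity tree: each entity is (group, name, children). This is NOT Fig. 2.
def entity(group, name, *children):
    return (group, name, list(children))


def years():
    return [entity("YEAR", y, entity("MONTH", "JAN"), entity("MONTH", "DEC"))
            for y in ("70", "71", "72")]


DATA_BASE = [
    entity("CITY", "KANSAS CITY",
           entity("STORE", "PLAZA", *years()),
           entity("STORE", "MAIN ST", *years())),
    entity("CITY", "TOPEKA", entity("STORE", "RT 46", *years())),
    entity("CITY", "LOS ANGELES", entity("STORE", "PLAZA", *years())),
]


def included(path, generated, refine_sets):
    """Decide whether the last entity on `path` belongs on the access tree."""
    group, name = path[-1]
    if group not in generated:
        return False                       # group pruned from the access tree
    on_path = dict(path)                   # group -> name for the entity and its ancestors
    for refine_set in refine_sets:
        if not any(group in rl for rl in refine_set):
            continue                       # this set does not refine the group
        # Refined inclusion: some refine list must name this entity and agree
        # with every ancestor that the list mentions.
        if not any(rl.get(group) == name and
                   all(on_path.get(g) == n for g, n in rl.items() if g in on_path)
                   for rl in refine_set):
            return False
    return True                            # whole inclusion, or all refinements matched


def access_tree(entities, generated, refine_sets, path=()):
    """Return the pruned tree as a list of paths, in depth-first order."""
    result = []
    for group, name, children in entities:
        here = path + ((group, name),)
        if included(here, generated, refine_sets):
            result.append(here)
            result += access_tree(children, generated, refine_sets, here)
    return result


# Example (iii): CITY refined, YEAR independently refined, STORE and MONTH whole.
generated = {"CITY", "STORE", "YEAR", "MONTH"}
refine_sets = [
    [{"CITY": "KANSAS CITY"}, {"CITY": "TOPEKA"}],
    [{"YEAR": "71"}, {"YEAR": "72"}],
]
tree = access_tree(DATA_BASE, generated, refine_sets)
```

Because recursion stops at any excluded entity, the descendants of an excluded entity never reach the tree, which is the pruning property described in the text. The sketch does not check that independent refine sets use mutually exclusive groups.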


4.6 Generating Entities on the Access Tree 


The generator accepts an access tree as an input. It generates only
entities on the access tree. For brevity in this section we will say
“offspring,” “sibling,” etc., but always mean “offspring on the access
tree,” “sibling on the access tree,” etc.

On each call the generator takes one of the following actions:


(A) Takes a step to the “next” entity of the access tree, opening 
that entity for data accesses. The entity reached is said to be 
generated. The client program is informed of the group of the 
entity and the entity’s identity among its siblings. 

(B) Notifies the client process that the previous entity generated 
was the last of a family on the access tree. A new entity is not 
generated on this call. This action gives the client process an 
opportunity to perform summary processing on families. 

(C) Notifies the client process that the previous entity generated 
was the last on the access tree. This action gives the client 
process an opportunity to perform final summary processing, 
and to exit from the processing loop. 


On the first call, the entity generated is the leftmost entity of the 
top group. This becomes the current entity. On subsequent calls, the 
generator tries to step from the current entity to another entity in the 
following order: 


1. Leftmost offspring of the current entity. 
2. Sibling of the current entity. 
3. Step-sibling of the current entity. 


The first of these that succeeds becomes the new current entity. If all 
fail, the current entity is redefined as the parent of the current entity, 
and the above process resumed at step 2. The effect is to continue the 
list with 


4. Sibling of the first ancestor of the current entity.
5. Step-sibling of the first ancestor of the current entity.
6. Sibling of the second ancestor of the current entity.


etc.

This process defines the meaning of “next” entity for action A.

Actions B and C allow the client program many opportunities to
perform processing on individual entities, summaries after families of
entities are generated, and a summary at the end of the tree. Process
loops are generally organized with the generator at the top of the loop.
Following this is a section of code that tests which action was effected
by the generator, and at what group. If an entity is generated in a
group where retrievals are to be made, control is passed to a section of
code that makes the retrievals from that entity and processes the data.
If a “done with family” action is signaled on a group for which family
summaries are being made, control is passed to a section of code that
effects the summaries for that group. After each of these code sections
is complete, control is returned to the generator to take the next
step. Eventually the “done with tree” action is signaled. The process
then exits from the processing loop and executes terminal processing.

Only the generator looks at the access tree data structure, and it 
confines the process to entities on the tree. Entities not on the access 
tree simply do not exist, as far as the process loop is concerned. 

The process can direct the generator to break from its normal
sequence of “next” entity steps. Thus further screening by
data-dependent “match” conditions of the entities to be processed can be
done. When an entity of a particular group is reached, the client
process can retrieve data from it and test for a match condition. If the
match condition is satisfied, other data are retrieved from the entity
and entered into the process. Then the generator is told to generate
the next entity, usually an offspring of the current entity.

But if the match condition is not satisfied, the client program goes 
directly to the generator, calling it with a skip option that causes all 
descendants of the current entity to be skipped. Normally the next 
entity generated in this case is a sibling of the current entity. Other 
skip options are available. Thus the process has final control over the 
entities entering the process, within the confines of the access tree. 
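
The processing loop described in this section might look as follows in Python. This is a sketch under assumptions, not the Master Links interface: the tree layout, the action tuples, and the use of `send` to pass a skip option are all invented for the illustration, and the "done with family" action is assumed to follow the last entity of a family and its descendants.

```python
def _walk(family):
    """Depth-first walk of one family (the offspring of one parent in one group)."""
    for ent in family:
        # Action A: generate the entity; the client may send back skip=True.
        skip = yield ("entity", ent["group"], ent["index"])
        if not skip:
            for child_family in ent["families"]:   # step-siblings form separate families
                yield from _walk(child_family)
    if family:
        # Action B: the family, and everything below it, is finished.
        yield ("end_of_family", family[0]["group"])


def generator(top_family):
    yield from _walk(top_family)
    yield ("end_of_tree",)                         # Action C


def stores():
    return [{"group": "STORE", "index": i, "families": []} for i in (1, 2)]


# Hypothetical access tree: two months, each with a family of two stores.
tree = [{"group": "MONTH", "index": m, "families": [stores()]} for m in (1, 2)]

log = []
gen = generator(tree)
action = next(gen)
while action[0] != "end_of_tree":
    skip = False
    if action[0] == "entity":
        # Hypothetical match condition: reject month 2 and skip its descendants.
        skip = action[1:] == ("MONTH", 2)
        if not skip:
            log.append(action[1:])                 # retrieval and processing go here
    else:
        log.append(action)                         # family summary goes here
    action = gen.send(skip)
```

The client code never inspects the tree itself. It only reacts to the action returned by each call, which mirrors the division of labor described above.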


4.7 Summary of Access Trees and Generators 


The generator and access trees provide a mechanism for efficiently 
accessing in a subtree of a data base those entities which may supply 
the data needed to process a particular request. Access trees have a 
natural derivation from English clauses that delimit the scope of a 
request. The generator can directly access the entities specified by an 
access tree. Thus, together, they constitute a very significant bridge 
between natural-language query and efficient retrieval algorithms. 


V. FUNCTION EVALUATOR 


An application program interacts with a data base at each entity
generated. For retrieval processes, the values to be displayed are often
combinations or functions of the stored data. The function “NET
SALES per store divided by EARNINGS” has one value per entity
of the store group. The values for this function could have been given
a name at build time, and established as a field of the data base. In
theory, any possible function that has one value per entity of some
group could be a stored field of that group. In practice, only those of
sufficiently high usage are stored, and the others are computed on
request. Thus there must be some mechanism which can deliver upon
request the value of a stored field or of a function of stored fields. This
mechanism is called the function evaluator.

This section presents a definition of several classes of fields whose 
values are not stored but can be derived from the stored data and from 
the hierarchical structure itself. 


5.1 Summarizing a Field 


A hierarchy provides a structure for efficiently summarizing data. 
For example, a user of the sample data base may require the total 
DOLLAR SALES for each store. To obtain such a total for a given 
store, the values of “DOLLAR SALES” must be summed over each 
department in that store. Repeating this summation for each store 
produces a set of values for the derived field “total DOLLAR SALES 
per store,” defined at the store group. This type of function is called 
a level raise because it raises the level of definition of a field from one 
group to a higher group. 

The set of entities used to evaluate a level-raise function for the 
store group consists of one entity of the store group and a collection 
of descendants of that entity. A subtree under group G is defined as an 
access tree containing at most one entity of group G. Hence, in the 
descendants of G a subtree under group G may branch out, but from G 
up to the root there is only one entity path. Let G be an ancestor of G’ 
and f’ a stored or derived field of G’. A level raise produces a value for a 
field f of group G by summarizing a field f’ of group G’ across the G’ 
entities in a subtree under group G. Values entering into the level 
raise are those of f’ for entities of the subtree. The set of values it 
produces for all entities of G defines a field f of group G. 

In the level-raise function “total DOLLAR SALES per store,”
“total” is an instance of a level-raise operation, “store” is G, and
“DOLLAR SALES” is f’, where G’ is the department group. In order
to construct an efficient computation algorithm, level-raise operations
are restricted to those which can operate sequentially on a set of
values for the field f’ to produce a single value of f. Examples of
level-raise operations are total, average, minimum, maximum, and standard
deviation for numeric-valued fields; any, all, and none for logic-valued
fields; and concatenation for character-string fields.
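
The restriction to sequential operations can be illustrated in Python. The sketch below is an assumption about how such operations might be organized, not the paper's code: each operation is given as an initial state, a step applied to one value of f’ at a time, and a final conversion. The sales figures are invented.

```python
import math
from functools import reduce

# Each level-raise operation: (initial state, step, finish).
LEVEL_RAISE = {
    "total": (0, lambda acc, v: acc + v, lambda acc: acc),
    "maximum": (-math.inf, lambda acc, v: max(acc, v), lambda acc: acc),
    "average": ((0, 0),
                lambda acc, v: (acc[0] + v, acc[1] + 1),   # running sum and count
                lambda acc: acc[0] / acc[1]),
}


def level_raise(op, values):
    """Fold the values of f' one at a time into a single value of f."""
    init, step, finish = LEVEL_RAISE[op]
    return finish(reduce(step, values, init))


# Hypothetical DOLLAR SALES per department, grouped by store.
dollar_sales = {"PLAZA": [120, 80, 200], "MAIN ST": [50, 70]}

# "total DOLLAR SALES per store": one value for each entity of the store group.
total_per_store = {store: level_raise("total", values)
                   for store, values in dollar_sales.items()}
```

Only the running state is held between values, so the evaluator need not collect all values of f’ before it starts. The average of an empty family is left undefined in this sketch.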


5.2 Retrieval Within an Entity


A derived field measuring average sales might be defined as
“DOLLAR SALES divided by SALES FORCE.” Since both
component fields are defined for each department entity, their quotient is
also defined for each department entity, and hence describes a derived
field of the department group. Any function of fields is called a field
function. The operands of a field function may be level-raised, as in
“maximum DOLLAR SALES per store divided by EARNINGS.”
This is a function of two store fields, “maximum DOLLAR SALES
per store” and “EARNINGS.” A field function can be defined in
terms of fields of different groups, as in “DOLLAR SALES divided
by EARNINGS.” The numerator is a department field, the
denominator a store field. The expression has one value for each department,
and hence defines a field of the department group.

A field function for group G is defined as any function of constants, 
fields of G, and fields of ancestors of G. These fields may be stored or 
derived. A field function produces a new field of G. It is applied at a 
single entity and produces a value defined for that entity. The class 
of field functions contains such operations as the standard arithmetic, 
Boolean, and trigonometric operations; logarithms; and IF-THEN- 
ELSE assignments. 

Arbitrary nesting of level-raise and field functions is well defined
since a function of either class generates a field. An example of such
nesting is “maximum per store of (DOLLAR SALES minus total per
department of (PURCHASE COST times the sum of ON ORDER
and BACK ORDERED)).” This expression is equivalent to the
following statements:


x = PURCHASE COST times the sum of ON ORDER and BACK 
ORDERED 

y = total x per department 

z = DOLLAR SALES minus y 

f = maximum z per store. 


x, y, z, and f are derived fields: x is an item field, y and z are
department fields, and f is a store field.
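
The four statements can be traced on a small example. The Python sketch below assumes a nested-dictionary layout and invented values. It is meant only to show at which group each derived field is defined.

```python
# Hypothetical store with two departments and their stock items.
store = {
    "departments": [
        {"DOLLAR SALES": 1000,
         "items": [{"PURCHASE COST": 10, "ON ORDER": 5, "BACK ORDERED": 1},
                   {"PURCHASE COST": 20, "ON ORDER": 2, "BACK ORDERED": 0}]},
        {"DOLLAR SALES": 500,
         "items": [{"PURCHASE COST": 5, "ON ORDER": 10, "BACK ORDERED": 10}]},
    ]
}


def x(item):
    """Field function at the item group."""
    return item["PURCHASE COST"] * (item["ON ORDER"] + item["BACK ORDERED"])


def y(department):
    """Level raise of x from the item group to the department group."""
    return sum(x(item) for item in department["items"])


def z(department):
    """Field function at the department group."""
    return department["DOLLAR SALES"] - y(department)


def f(store_entity):
    """Level raise of z from the department group to the store group."""
    return max(z(department) for department in store_entity["departments"])


result = f(store)
```

With these values the first department gives y = 100 and z = 900, the second gives y = 100 and z = 400, so f is 900 for the store.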


5.3 Entity Specification and Qualification 


The functions considered thus far generate new fields. The next
discussion treats functions which modify the set of entities over which
a field is evaluated. An entity-specification function describes a process
which, given a subtree under group G, selects another subtree under
group G using only the intrinsic order of the entities in each group or a
constant entity designation. In the sample data base, the years and
months are ordered within their respective families. Therefore, the
request “ADVERTISING divided by previous year ADVERTISING”
is defined for each month. Given a particular month the numerator
is obtained directly, whereas the denominator is retrieved for an
entity whose location in the tree structure is determined relative to
the given month by the operation “previous year.” This is called a
relative entity specification.

Constant entity specification denotes a fixed subtree under group G 
which overrides the given subtree. The ratio of NET SALES to 
January NET SALES describes a constant entity specification. 

Hierarchical structures make entity specification an efficient process
for selecting the entities over which to evaluate an expression. A more
general but less efficient selection is that of entity qualification, as in
the “with” phrase of “average per year of (BACK ORDERED with
PURCHASE COST greater than 500).” Entity qualification is
independent of the order of entities in a group. All entities must be
examined according to a criterion, such as “PURCHASE COST
greater than 500.” Each entity is assigned the value “accept” or
“reject.” When an entity is rejected, all of its descendants are rejected
as well. The descendants of an accepted entity are likewise accepted
as far as that criterion is concerned. The qualification process is
inefficient because data must be retrieved for all candidate entities;
in entity specification no test data are retrieved from any entities.
Hence, the earlier example with a January specification might be
equivalently phrased “NET SALES divided by total NET SALES
with MONTH NAME = JAN.” In the denominator, each month
entity must be examined to determine whether or not its name is
January. Although constant entity specification can be contorted
into entity qualification if entity identifiers are stored values, the
relative entity-specification functions, such as “previous,” cannot be
expressed at all with entity qualification, unless the family-order
relations are also stored as data values.
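
The contrast between the two kinds of selection can be sketched in Python. The layout is an assumption made for the example: month entities are keyed by (year index, month index), and the ADVERTISING values are invented.

```python
# Hypothetical ADVERTISING values keyed by (year index, month index).
advertising = {(1, 1): 100.0, (1, 2): 110.0, (2, 1): 125.0, (2, 2): 99.0}


def previous_year(month_entity):
    """Relative entity specification: same month, preceding year. No data are read."""
    year, month = month_entity
    return (year - 1, month)


def advertising_ratio(month_entity):
    """ADVERTISING divided by previous year ADVERTISING, for one month entity."""
    prior = previous_year(month_entity)
    if prior not in advertising:
        return None                      # there is no previous year for this entity
    return advertising[month_entity] / advertising[prior]


def qualified(entities, field, criterion):
    """Entity qualification: every candidate's value is retrieved and tested."""
    return [e for e in entities if criterion(field[e])]


ratio = advertising_ratio((2, 1))
over_105 = qualified(sorted(advertising), advertising, lambda value: value > 105)
```

`previous_year` works on indexes alone, whereas `qualified` must fetch a value from every candidate entity, which is the efficiency difference the text describes.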

In summary, level-raise and field functions can be computed for
all entities of a group. A function of either type produces one value for
each entity of a group, and hence defines a nonstored field of the group.
Entity specification and qualification functions produce a subtree at
each entity of a group. A function evaluator enables the user of an
interactive data system to dynamically define and redefine derived
fields and retrieve values for these fields, all in an interactive
communication in real time. A field is either a data-base field or a function of
fields, such that it has one value for each entity of some group. A
function evaluator enables a client program to retrieve values of a
field without having to distinguish whether the field is stored or derived.
To accomplish this, a function evaluator must be able to accept
function definitions during the dialogue, rather than have them
compiled into machine-executable code. In addition, it must be capable of
evaluating arbitrarily nested functions if the user is truly to ignore
distinctions between stored and derived fields. Otherwise the user
would be constrained to use only specific types of fields in each class of
functions.


1708 THE BELL SYSTEM TECHNICAL JOURNAL, DECEMBER 1973


5.4 The Retrieval Process 


This section presents an algorithm for an evaluator capable of 
computing values for derived fields over a hierarchical data base. The 
algorithm is a recursive procedure. 


Input: A four-tuple: (f, G, t, S).
Arguments:


f: A field defined at group G.
G: A group.
t: An entity-selection function for group G.
S: A subtree under group G.


The argument f can be a stored field, v; a field function, p; or a
level-raise function, l.


p: A field function whose nth argument is the triple (f_n, G_n, t_n):
    f_n: A field for group G_n.
    G_n: A group, either G or an ancestor of G.
    t_n: An entity-selection function for group G_n.
l: A level-raise function with three arguments:
    f’: A field for group G’.
    G’: A descendant of G.
    t’: An entity-selection function for group G’.


The argument t can be an entity-specification function, s, or an
entity-qualification function, m.


s: An entity-specification function with one argument:
    t_1: An entity specification for group G.
m: An entity-qualification function with three arguments, having
    the same definition as a triple in p, above.


The algorithm for this evaluation is as follows.


1. If t has the form s(t_1), perform the specification function s on S
   at G to produce a new subtree S_1; then evaluate (f, G, t_1, S_1).
   Return.

2. If t has the form m(f_1, G_1, t_1), set t’ to null, evaluate (f_1, G_1, t’, S),
   and perform the qualification function m on the result to produce
   S_1. If S_1 contains an entity of G evaluate (f, G, t_1, S_1) and return;
   otherwise return a null value.

3. If f is a stored field, v, retrieve its value for the entity of group G
   on S. Return.

4. If f has the form p{(f_1, G_1, t_1), (f_2, G_2, t_2), ...}, do step 4a for
   each component (f_n, G_n, t_n); apply p to the resulting values and
   return.
   4a. If G_n = G, evaluate (f_n, G, t_n, S); otherwise construct S_n
       as the subtree under group G_n containing the entity of G_n
       present on S and containing all of that entity’s descendants,
       and evaluate (f_n, G_n, t_n, S_n).

5. If f has the form l(f’, G’, t’), select S’ as a subtree of S under group
   G’, containing the first G’ entity of S. Evaluate (f’, G’, t’, S’).
   For each succeeding entity of G’ on S, do step 5a. Return.
   5a. Construct a subtree S’ for group G’ using the G’ entity
       specified in step 5, and evaluate (f’, G’, t’, S’); then apply l
       to the previous result and the new value.


This algorithm is summarized by the following production. 


(f, G, t, S) → (f, G, s(t_1), S) |
               (f, G, m(f_1, G_1, t_1), S) |
               (v, G, t, S) |
               (p{(f_1, G_1, t_1), (f_2, G_2, t_2), ...}, G, t, S) |
               (l(f’, G’, t’), G, t, S)
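
The recursive shape of the algorithm can be seen in a reduced Python sketch. It covers only steps 3, 4, and 5 and assumes a null entity-selection function t throughout, so steps 1 and 2 are omitted. The class `Entity`, the tuple encoding of fields, and the sample values are all assumptions made for the illustration.

```python
from functools import reduce


class Entity:
    def __init__(self, group, values, children=()):
        self.group, self.values, self.parent = group, values, None
        self.children = list(children)
        for child in self.children:
            child.parent = self


def ancestor(entity, group):
    """Step 4a: climb the single path toward the root to the entity of `group`.

    Assumes `group` is the entity's own group or the group of an ancestor.
    """
    while entity.group != group:
        entity = entity.parent
    return entity


def descendants(entity, group):
    """Yield the entities of `group` below `entity`, left to right."""
    for child in entity.children:
        if child.group == group:
            yield child
        else:
            yield from descendants(child, group)


def evaluate(f, entity):
    kind = f[0]
    if kind == "stored":                           # step 3: stored field v
        return entity.values[f[1]]
    if kind == "func":                             # step 4: field function p
        _, p, args = f
        return p(*(evaluate(f_n, ancestor(entity, g_n)) for f_n, g_n in args))
    if kind == "raise":                            # step 5: level-raise function l
        _, op, f_prime, g_prime = f
        values = (evaluate(f_prime, e) for e in descendants(entity, g_prime))
        return reduce(op, values)                  # l applied to the values in sequence
    raise ValueError(kind)


def divide(a, b):
    return a / b


store = Entity("STORE", {"EARNINGS": 50}, [
    Entity("DEPT", {"DOLLAR SALES": 100}),
    Entity("DEPT", {"DOLLAR SALES": 300}),
])

# "maximum DOLLAR SALES per store divided by EARNINGS", a store field.
max_sales_ratio = ("func", divide, [
    (("raise", max, ("stored", "DOLLAR SALES"), "DEPT"), "STORE"),
    (("stored", "EARNINGS"), "STORE"),
])

# "DOLLAR SALES divided by EARNINGS", a department field.
dept_ratio = ("func", divide, [
    (("stored", "DOLLAR SALES"), "DEPT"),
    (("stored", "EARNINGS"), "STORE"),
])

store_value = evaluate(max_sales_ratio, store)
dept_value = evaluate(dept_ratio, store.children[0])
```

Since every case returns a value for one entity, a stored field, a field function, and a level raise can be nested freely, which is the property the production expresses.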


5.5 Unavailable Data 


Some of the fields in a data base may not have values at some time. 
For example, a new stock item, X, may be ordered although its selling 
price has not yet been determined. Now someone designing a new 
product using parts X, Y, and Z needs to determine the total selling 
price of the components, that is, a summation level raise restricting 
items to X, Y, and Z. Clearly, if the value of “selling price” is
unavailable for X, then the value of the sum is also unavailable. Should
an unavailable unit of data be assigned the value zero, the level raise 
would produce 0 plus Y plus Z as the material cost of the product—an 
alarming situation at best. Similarly, the field function “selling price
times IN STOCK” must yield a result of unavailable if the value of 
either operand is unavailable. Notice that this situation is not one in 
which the user has entered a value of null, but rather one in which the 
data are not available or have not been entered. NA (not available) in- 
dicates that no significant data are present. 

Unavailable values occur as well in logical-valued fields, particularly
when level-raising with operators of “any” and “all.” “Any X” has
the value “true” if the field X has the value “true” for any descendant
of the current entity. “All X” is true only if X is true for all
descendants. If X has a value of NA (not available) it could be true or false,
but we do not know which. Representing TRUE, FALSE, and NA
numerically such that TRUE < NA < FALSE, an investigation of
each possible situation will verify the following:


any X = min (X) 
all X = max (X) 
any not X = not all X 
all not X = not any X 
where: not NA = NA, 
not TRUE = FALSE, 
not FALSE = TRUE. 
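
These identities can be checked mechanically. The Python sketch below assumes the particular encoding TRUE = 0, NA = 1, FALSE = 2, which is one choice that satisfies TRUE < NA < FALSE. The function names are invented for the example.

```python
from itertools import product

# One numeric encoding with TRUE < NA < FALSE.
TRUE, NA, FALSE = 0, 1, 2


def not_(x):
    return {TRUE: FALSE, FALSE: TRUE, NA: NA}[x]


def any_(values):
    return min(values)      # a single TRUE wins; otherwise NA outranks FALSE


def all_(values):
    return max(values)      # a single FALSE wins; otherwise NA outranks TRUE


# Exhaustively check both identities over every family of one to three values.
for n in (1, 2, 3):
    for xs in product((TRUE, NA, FALSE), repeat=n):
        assert any_([not_(x) for x in xs]) == not_(all_(xs))
        assert all_([not_(x) for x in xs]) == not_(any_(xs))
```

With this encoding "not" is the map x to 2 - x, so the minimum of the negated values is the negation of the maximum, which is why the two identities hold.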


In criteria evaluation, such as testing if X is less than Y, the result 
must be NA if the value of either X or Y is NA. If either value is 
unknown, the criterion may or may not be satisfied; the result is 
unavailable. 

NA is a value which describes the absence of a value.
Entity-qualification functions produce an accept or reject status. “Reject”
describes the absence of an entity. In “average (ON ORDER with
PURCHASE COST greater than 500) per department” the qualifier
rejects entities in the averaging. “Reject” is needed as a value of
stored and derived fields as well. It enables IF-THEN-ELSE
statements to express entity qualification. Moreover, suppose that before
April of a certain year the entire sales of a department was recorded
in the NET SALES field, while after that time the sales were broken
down into NET SALES and SALES TAX. Now if SALES TAX is
given the value zero in the first months, the expression “average tax
per year” for any department will produce a peculiar result because
of the zeroes averaged in. NA is unacceptable as the value of SALES
TAX in the early months since it would cause a field such as “average
(NET SALES plus SALES TAX) per year” to return a value of NA,
although the true value is well defined. Instead the stored value
“reject” is used. Operationally “reject” is the identity for PLUS,
MINUS, AND, and OR. For other operations, if any operand has the
value “reject” the result is also “reject.” Hence, when adding 5 to
“reject” the result is 5, and when testing whether or not 5 is less than
“reject” the result is “reject.”
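
The difference between NA and “reject” can be sketched in Python. The sentinel objects and function names are assumptions made for the example. In particular, returning “reject” for an average over no accepted entities is a guess, since the text does not cover that case.

```python
REJECT = object()   # marks the absence of an entity
NA = object()       # marks the absence of a value


def plus(a, b):
    if a is REJECT:
        return b                  # "reject" is the identity for PLUS
    if b is REJECT:
        return a
    if a is NA or b is NA:
        return NA                 # an unavailable operand makes the sum unavailable
    return a + b


def less_than(a, b):
    if a is REJECT or b is REJECT:
        return REJECT             # outside PLUS, MINUS, AND, OR: reject propagates
    if a is NA or b is NA:
        return NA
    return a < b


def average(values):
    kept = [v for v in values if v is not REJECT]   # rejected entities drop out
    if not kept:
        return REJECT             # assumption: nothing left to average
    if any(v is NA for v in kept):
        return NA
    return sum(kept) / len(kept)


# SALES TAX was not recorded separately in the first two months.
tax_with_reject = average([REJECT, REJECT, 6, 9])
tax_with_zero = average([0, 0, 6, 9])
```

Storing “reject” gives an average of 7.5 over the two months that have a tax figure, whereas storing zero pulls the average down to 3.75.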


VI. POINTER AND DATA STRUCTURE 


The previous sections have defined Master Links from the user’s
point of view. To implement the features described, and to achieve
the other stated goals (high performance in a time-sharing
environment, portability, and multiple concurrent users), requires a new
approach to the layout of the data-base elements onto the host system’s
files. In the classical approach to data-base design, records are used
for many purposes. One purpose is to associate data values; another
is retrieval efficiency: data values used together are stored together
in a record. Update interlocking is a third use: exclusive control of a
record or set of records is granted to a process so that it may make a
series of changes to their contents without interference from other
processes.

Master Links provides three distinct tools to achieve these three 
results, without having to rely on physical-storage records: 


(i) Association of data items is accomplished by the pointer
structure described in Section 6.2.1.
(ii) Retrieval efficiency is achieved by a parametrized layout of
the data values into a data block, Section 6.2.2.
(iii) Multiple concurrent updates by many users are made possible
by the concept of a lock unit, described in Section 6.1.


With Master Links, programs (and people) work with the logical 
structure of a data base, unhampered by its physical layout on the 
direct-access files. The details of record and file boundaries are in- 
visible at the logical level. The basic concepts of Master Links, as well 
as all or most of the detail logic that implements the concepts, are 
independent of any machine or host system. 

The mechanism used to achieve this freedom is the stream. 


6.1 Streams 


A word is an arbitrary unit of storage, the meaning of which is
determined by the host system. A stream is a series of words. A
particular word in a stream is identified by its position in the series. This
ordinal number is called WIS (word in stream). The size of a stream
in words may be increased or decreased to accommodate changes in
data-base size. A data base is built from several streams. A stream
therefore needs an identifier, which is designated S. A particular word
of a data base is completely determined by S and WIS. A stream is
made up of a series of records, where a record is defined as a set of
words transmitted between primary and secondary storage by the
host system as a single unit. A pointer into a stream is always in terms
of WIS, never in terms of record number, or word in record, or any
other host-system concept.

Exclusive use of a segment of a stream, called a lock unit, is re- 
quired for updates. In fact, the lock unit may be a record, a set of 
records, a file, etc., depending on the capabilities of the host file- 
management system. The interlocking of multiple concurrent updates 
anywhere in the data base occurs correctly regardless of the boundaries 
of the lock units. A lock unit is, from the traffic point of view, a re- 
source. It is important that as few lock units as possible be locked for 
the shortest time possible in servicing an update, and that a lock unit 
cover the smallest possible area. 

One can plan efficient use of streams in terms of S and WIS alone.
The probability that two words, WIS and WIS + K, of the same 
stream are in the same record is 1 for K = 0 and linearly decreases to 
zero with the magnitude of K. That is, words close together in the 
stream are likely to be in the same record. The same is true of two 
words and a lock unit. Thus, by adopting a probabilistic viewpoint, 
efficient use of streams can be planned without a detailed knowledge 
of file and record boundaries. 
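
The linear fall-off can be made explicit under one added assumption, not stated in the text: that the position of word WIS within its record is equally likely to be any of the WPR positions. For 0 ≤ K ≤ WPR, exactly WPR − K of those positions leave WIS + K in the same record, which suggests

\[
P(\text{same record}) = \max\!\left(0,\; 1 - \frac{|K|}{\mathrm{WPR}}\right).
\]

With WPR = 100, for instance, two words 25 apart would share a record with probability 0.75 under this assumption.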

Streams are implemented in the Master Links software using 
direct-access files. Catalogued, direct-access files with a fixed number 
of unformatted, fixed-length records are used because such files are 
generally available and operate efficiently on existing time-sharing 
systems. This is the simplest and most commonly available type of 
direct-access file available today. A file set is a (possibly null) series 
of direct access files. Each file of the series has the same dimensions: 
RPF records per file, and WPR words per record. A file of a file set is 
identified by its ordinal position in the series; this number is called 
FIFS (file in file set). A file set forms a series of computer words when
the files are viewed as logically concatenated in the order of their
FIFS numbers, with the records of each file being logically concatenated
in the order of their record numbers. Thus, graphically, a file set
can be pictured as follows:


[Figure: a row of boxes numbered 1, ..., RPF for each file, with the files laid end to end and labeled FIFS = 1, FIFS = 2, ..., FIFS = K.]


Each box represents a record; its record number is shown inside the box.
This construction does not imply that the files, or even the records of a 
file, be physically concatenated on secondary storage. The actual 
allocation of files upon direct-access devices is a responsibility of the 
host file-management system. A file set can grow, and the unit of 
growth is a file. A new file is assigned the next ordinal number available 
for the file set to which it is assigned. A file set is, therefore, a finite but 
extensible series of computer words, and hence is an implementation 
of a stream. Several file sets are used to implement the several streams 
of a data base. 

To access word WIS in stream S, the file set for S is determined
from S. Then the dimensions RPF and WPR are determined. By
integer division and modulo arithmetic on WIS, the FIFS, the record
number, and the word in record are calculated. Thus the word is
described in terms of files and records, and can be accessed.
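
The division and modulo step can be sketched as follows. This is a minimal illustration, not the Master Links code: the function name `locate` and the type `RecordAddress` are hypothetical, and the sketch assumes that WIS, FIFS, record numbers, and words in a record are all numbered from 1.

```python
from typing import NamedTuple


class RecordAddress(NamedTuple):
    fifs: int    # file in file set, numbered from 1 (assumed)
    record: int  # record number within the file, numbered from 1 (assumed)
    word: int    # word in record, numbered from 1 (assumed)


def locate(wis: int, rpf: int, wpr: int) -> RecordAddress:
    """Map a word-in-stream number to (FIFS, record, word in record).

    rpf is records per file and wpr is words per record, the two
    dimensions shared by every file of the file set.
    """
    if wis < 1:
        raise ValueError("WIS is taken to start at 1")
    offset = wis - 1                      # zero-based word offset in the stream
    words_per_file = rpf * wpr
    fifs = offset // words_per_file + 1   # integer division selects the file
    within_file = offset % words_per_file
    record = within_file // wpr + 1       # integer division selects the record
    word = within_file % wpr + 1          # remainder selects the word
    return RecordAddress(fifs, record, word)
```

With RPF = 4 and WPR = 10, for example, `locate(57, 4, 10)` places word 57 in the second file, second record, seventh word.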


6.2 Pointer Structure


This section describes the pointer structure of Master Links. The 
design derives from the following goals: 


(i) The data structure must be designed for auxiliary storage.

(ii) Data may be updated and elements added to and deleted from
the hierarchy by simple, efficient algorithms.

(iii) These operations serve multiple concurrent users.

(iv) The integrity of the data structure must be maintained in the
event of a machine failure.

(v) A single set of algorithms must access any hierarchical data
base.

(vi) The storage of the hierarchy must provide efficient hierarchical
traversal; that is, at any position in a hierarchy, the accessing
routines must be able to directly address any subordinate or
sibling.


6.2.1 Development of the Pointer Structure 


In Section II, entities were described as having data and structure. 
Structure connects an entity to its relatives. In order to attain efficient 
traversal from an entity to any of its siblings or offspring, regardless 
of the number of offspring, the structure part of the entities in a
family must be stored contiguously. Otherwise, a sequence of reads 
would be needed to follow the chain of sibling pointers through 
auxiliary storage. For families to grow in real time and still have their 
members contiguous, the growth process requires copying the old 
family description to some available space and then appending new 
entities. If data of an entity were stored with the structure, this copy 
process would become expensive and would leave large amounts of 
space vacant. Rather, the data are stored separately in a matrix, or 
data block. The columns of the matrix correspond to entities, the rows 
to fields. Therefore, the structure information for an entity must 
include a reference to the data-block column number assigned to 
that entity. 


[Figure: structure information, whose data-block column numbers refer to columns of the data block.]

Column numbers, rather than absolute storage locations, are used to 
reference the data block, allowing separation of structure from data. 

The hierarchical structure is completely divorced from the data 
storage structure. Whichever way the matrix is stored—by row, by 
column, by submatrix—has no bearing on the hierarchical structure 
information. The Master Links data-block storage arrangement is 
discussed in Section 6.3. 

The offspring of an entity are linked to the entity by means of 
pointers which specify their storage locations in a stream. Every 
entity has one offspring pointer for each family of offspring. The
collection of the data-block column number and the offspring pointers
for one entity is called an entity pointer set.


[Figure: an entity pointer set drawn as consecutive one-word cells: the data-block column number, a pointer to an offspring family, ..., a pointer to another offspring family.]














An entity pointer set (EPS) contains the structure information for an 
entity in a contiguous stream of words. 

Since each entity in a group has the same number of offspring 
families, the entity pointer sets for all entities in one group are the 
same size. Therefore, since siblings are adjacent, a pointer to the 
beginning of a family provides direct access to each member of the 
family. If the siblings were chained together, a null pointer in the 
chain would indicate the end of a family. However, with the siblings 
contiguous, the size of each family must be stored instead. It is most 
convenient to store the family size just before its first entity pointer set. 





A family of n entities is described by a one-word family size, n, followed 
by n entity pointer sets (EPS), one for each entity in the family.

The entity index is the ordinal position of the entity pointer set in 
the physical pointer structure for one family. In deleting an entity 
from a family, it is important not to change the entity index of other 
entities in the family. Therefore, a special flag is encoded in the entity 
pointer set to show that the entity is deleted. Entities which are 
physically present in a stream but which have been flagged as deleted 
have reserved status. They are not considered present in the hierarchy. 
If the delete flag of a reserved entity is later turned off, the entity 
becomes active and is then treated as part of the hierarchy. 

All the family pointer sets of a group are stored in a stream. This 
stream is called the master link of the group. 

In summary, the description of the entities in a group is stored 
in a stream of words. A group is made up of families. A family is 
described by a family size, followed by a set of pointers for each entity 
in the family. An entity pointer set consists of a data pointer and a 
collection of offspring pointers. The data pointer is a column number 
of the data block for the group. The offspring pointers specify word 
positions of the appropriate streams. Entities allocated in the pointer 
structure are either active or reserved; the latter do not appear in the
logical structure of the data base. 
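
Because the entity pointer sets of a family are contiguous and of equal size, the position of any member follows from the family pointer by arithmetic alone. The sketch below illustrates this under stated assumptions: the stream is modeled as a Python list in which WIS w is element w − 1, an entity pointer set is taken to be one column-number word followed by one word per offspring family, and a null offspring pointer is written as 0. The function names are hypothetical, and the delete flag is not modeled.

```python
def eps_start(family_wis: int, entity_index: int, eps_size: int) -> int:
    """WIS of the first word of the entity pointer set with the given index.

    family_wis is the WIS of the family-size word; entity_index counts from 1.
    """
    return family_wis + 1 + (entity_index - 1) * eps_size


def read_family(stream: list, family_wis: int, offspring_families: int) -> list:
    """Return (column number, offspring pointers) for each entity of a family."""
    eps_size = 1 + offspring_families       # assumed layout of one pointer set
    n = stream[family_wis - 1]              # the family size precedes the sets
    family = []
    for k in range(1, n + 1):
        start = eps_start(family_wis, k, eps_size)
        words = stream[start - 1 : start - 1 + eps_size]
        family.append((words[0], words[1:]))
    return family
```

No chain of sibling pointers is followed: reaching the kth member costs one multiplication and one addition, which is the property goal (vi) asks for.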


6.2.2 Pointer Structure Algorithms 


The algorithms which modify the data-base structure must be safe 
over abnormal termination of the process. A process can be abnormally 
terminated in many ways, such as by a user interrupt, a hardware or 
software failure in the host environment, or an intentional stop by the 
host system for exceeding some resource allocation. The key to making 
a transaction safe over unexpected terminations is to first allocate 
any new space needed, then fill out the new space, and finally link the 
new space with the old by a single pointer. If the write of that pointer 
succeeds, the new information is secure. If it fails, the area remains 
disconnected and wasted, but the data structure remains intact. 

The algorithms must also work correctly when several concurrent 
users are trying to execute them. This is assured by locking a word to 
be updated (the lock unit must cover the entire physical record 
containing that word), reading the record to obtain a fresh copy, 
updating the value and any other values in the same record, re-writing 
the record, and then relinquishing the lock. 

There are three functions which modify the pointer structure. An 
entity’s status can be reversed (from active to reserved or back 
again) ; new entities can be added to an existing family; and a family 
can be created and linked to its parent. 

To reverse an entity’s status, the stream location of the first word 
of the entity pointer set is computed. The process then requests of the 
host environment exclusive control of the lock unit containing this 
word. When exclusive control is authorized, the record is read, the 
required word is updated, the record is written back to auxiliary 
storage, and the exclusive control is relinquished. If the process is 
terminated before the write, it can be re-executed because nothing 
has been altered. If it is terminated after the write, a restart procedure 
can read the record to determine that the update was successful, and 
skip re-doing it. 
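
The lock, read, update, write, unlock sequence can be sketched with an in-memory stand-in for the direct-access records. Everything here is illustrative: the class `RecordStore` is hypothetical, the lock unit is assumed to be exactly one record, and the reserved status is assumed to be encoded as a negative value in the first word of the entity pointer set (the paper does not give the actual encoding). The status is set to a target state rather than toggled, so that re-executing the call after an interruption is harmless.

```python
import threading


class RecordStore:
    """In-memory stand-in for fixed-length records, one lock per record."""

    def __init__(self, n_records: int, wpr: int):
        self.wpr = wpr
        self.records = [[0] * wpr for _ in range(n_records)]
        self.locks = [threading.Lock() for _ in range(n_records)]

    def set_reserved(self, wis: int, reserved: bool) -> None:
        """Mark the entity whose pointer set starts at wis reserved or active."""
        rec, pos = divmod(wis - 1, self.wpr)     # record and word holding WIS
        with self.locks[rec]:                    # request exclusive control
            fresh = list(self.records[rec])      # read a fresh copy of the record
            magnitude = abs(fresh[pos])
            # Assumed encoding: a negative first word means "reserved".
            fresh[pos] = -magnitude if reserved else magnitude
            self.records[rec] = fresh            # write the record back
        # Exclusive control is relinquished on leaving the with-block.
```

Calling `set_reserved(wis, True)` twice leaves the same result as calling it once, which mirrors the restart behavior described above.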

Extra entities are added to a family by first locking the record which 
contains the word pointed to by the parent’s offspring pointer. This 
word, called the linkage word of the family, is the single word to be 
made a pointer to the new space. The first time that this algorithm 
is applied to a given family the linkage word contains the family size. 
Next, a sequence of contiguous words in the stream is allocated and 
locked. The existing entity pointer sets of the family are copied into
the new space; new entity pointer sets are constructed by generating 
parent pointers, new data column numbers, and null offspring pointers; 
any of the new entities that are to be reserved for future assignment 
are marked reserved; and, finally, the new family size is placed into 
the first word of the new space. After this new area has been written 
and unlocked, the linkage word is updated to point to the new area. 
The record containing the linkage word is then written and unlocked. 
This update of a single word links the new family pointer set to the 
existing hierarchy and disconnects the old family description from it. 
The whole process is called a bubble-out; the wasted space containing 
the old family description is called a bubble. 
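
The order of operations in a bubble-out (allocate, fill, then link by one word) can be sketched on a stream held as a Python list, with WIS w at element w − 1. Several points are assumptions made only for illustration: new space is taken from the end of the list, a linkage word holding a negative value −w is read as a pointer to WIS w while a non-negative value is read as a family size, and locking, parent pointers, and column-number assignment are omitted. The function names are hypothetical.

```python
def family_location(stream: list, linkage_wis: int) -> int:
    """WIS of the current family-size word, reached through the linkage word."""
    word = stream[linkage_wis - 1]
    # Assumed encoding: negative means "pointer to the bubbled-out family".
    return linkage_wis if word >= 0 else -word


def bubble_out(stream: list, linkage_wis: int, eps_size: int, new_sets: list) -> int:
    """Append new entity pointer sets to a family; return the WIS of the new area."""
    old = family_location(stream, linkage_wis)
    n_old = stream[old - 1]
    old_words = stream[old : old + n_old * eps_size]   # existing pointer sets

    # 1. Allocate new space (here simply the end of the stream).
    new_wis = len(stream) + 1

    # 2. Fill out the new space: new family size, old sets, then new sets.
    area = [n_old + len(new_sets)] + old_words
    for eps in new_sets:
        if len(eps) != eps_size:
            raise ValueError("every entity pointer set must have eps_size words")
        area.extend(eps)
    stream.extend(area)

    # 3. Link the new space with a single word. Until this write, the old
    #    family description is still the one reachable from the parent.
    stream[linkage_wis - 1] = -new_wis
    return new_wis
```

If the process stops before step 3, the appended area is unreachable and wasted but the hierarchy is unchanged; after step 3, the old description is the bubble.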










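[Figure: the parent's offspring pointer, in the stream of the parent of the bubbled-out family, and the linkage word, which leads to the entity pointer sets (EPS) of the new family description.]

The linkage word is a single word connecting the new family description
to the previously existing structure.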

Notice that the linkage word had to be locked from the start to the 
finish to keep other users from adding to the same family at the same 
time. Such interference could cause the family update to be lost 
entirely. 

Creating a new family adds one complication to the bubble-out 
algorithm. Here, there is no linkage word, so the parent’s offspring 
pointer must be locked and re-read instead. If the parent’s offspring 
pointer to this group is still null after locking and re-reading, it is 
updated to point to the new family. If non-null, another user has in the 
meantime attached a family to the parent, so either the status-change 
or the bubble-out algorithm is entered. 




















[Figure: the new family size at the head of the stream of the bubbled-out family.]





6.2.3 Further Considerations 


There are several considerations which affect the performance of 
the bubble-out algorithm. Among these are the handling of available 
space, the disposition of bubbles, and the use of an entity reservation 
factor for assigning sets of entities at one time. These considerations
involve important balances between execution speed and auxiliary- 
storage space. 

For each master link, the WIS for the next available word of the 
stream is stored as a header to the master link. The bubble-out algo- 
rithm first locks the header, then the linkage word. The new family 
size is calculated and the header is updated, written, and unlocked. 
Then each record to be written is locked, written, and unlocked before 
the next one is locked. Since the linkage word is already in use it 
must lie somewhere between the header and the available space. Hence, 
the locations locked are within one stream, and in numerically in- 
creasing order of WIS. This precludes any chance of a deadlock since 
a stream is stored in an ordered set of records. As for the bubbles, a 
utility can be run in the background from time to time to remove 
them. The bubble-out algorithm assigns column numbers to the new 
entities, so it must update an available-data-block-column word at 
the same time it updates the stream available-space word. Hence, the 
appropriate place to store this available-column number is the master 
link header. 

A parameter of the bubble-out process allows the reservation of 
extra (inactive) entities. Assigning the extra entities causes only one 
bubble-out for all of the entities created, releasing only one bubble. 
The status-change algorithm is considerably more efficient than the 
bubble-out algorithm, so the average entity creation cost is reduced. 
The bubble-out algorithm assigns data-column numbers to the new 
entities, so data for the entities of a family are stored in sets of con- 
tiguous columns. As explained in the next section, this usually makes 
data accessing more efficient than if the columns of a family were 
scattered throughout the block. The cost of reservations is the cost of 
carrying the extra entities in storage before they are activated. The 
reservation factor can specify a constant increase or a growth factor 
as a percentage of the current family size. 


6.3 Data Blocks 


Data-base processes have a strong tendency to access either values 
of many fields from a few entities, or of a few fields from many entities. 
In the latter case the entities tend to be requested in an order deter- 
mined by the hierarchy of the data base. A given data base will have 
a mix of these two types. If the first type predominates, it is efficient 
to order the values column-wise. If the second is more common, 
efficiency is gained by arranging the values row-wise, and by assuring 
that entities processed together occupy, with high probability, adjacent
columns. 

For Master Links, the mechanism for storing values is the data 
block. A data block is a matrix of values with one column for each 
entity, and one row for each field. An element of this matrix is one 
value, which takes up one or more words. The number of words needed 
to store one value is a parameter of the field, called its size. Thus all
the values of one row are of the same size, but the values in two
different rows may have different sizes. For each group of a data base
there is one data block. The arrangement of a block into records is
controlled by several block parameters which are attributes of the
corresponding group. These parameters provide a variety of possible 
structures, of which the column-wise and row-wise layouts are special 
cases. Using the block parameters as inputs, a single algorithm can 
access any block arrangement. 


6.3.1 Layout of a Data Block 


A data block is stored in one stream. The block has two parameters, 
words per column, WPC, and columns per subblock, CPSUB, which 
are used to divide the block into subblocks. WPC is an integer equal 
to the sum of the sizes of the fields. This is the vertical dimension, in 
words, of the block, and also of the subblocks. The parameter CPSUB 
defines the horizontal dimension of a subblock. The first subblock 
consists of columns 1, ..., CPSUB; the second is columns CPSUB + 1,
..., 2·CPSUB; etc. A block is then a horizontal concatenation of
subblocks. 

The ordering of words in a block is established by keeping the words 
of a multiword value together, and arranging the values in row-wise 
order within a subblock, and then concatenating the subblocks from 
left to right. This is the ordering used to store a block in a stream. 
The order of values in a block is illustrated below. Each solid arrow 
indicates contiguous storage of CPSUB values. 


[Figure: a data block divided from left to right into subblock 1, subblock 2, ..., each CPSUB columns wide.]

It should be noted that the subblock plays no further role except to
establish an ordering. It does not correspond to a file, a record, or any
other host-system concept. In fact, the mapping of block words onto
stream words is performed without concern for record or file boundaries.


6.3.2 Example of Block to Stream Mapping


Consider a block with three rows and five columns, where SIZE 
(words per value) is set to 1, 2, and 1 for rows 1, 2, and 3, respectively. 
Suppose a numeric value takes one word, and a character value takes
one word for every four characters. Then a sample of the block looks
like:


1.00     | 2.00     | 3.00     | 4.00     | 5.00
AAAAAAAA | BBBBBBBB | CCCCCCCC | DDDDDDDD | EEEEEEEE
10       | 20       | 30       | 40       | 50

Words per column (WPC) is four. The mapping of this block onto a 
stream is shown below for three different values of CPSUB. 


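CPSUB = 1:

1.00 | AAAA | AAAA | 10 | 2.00 | BBBB | BBBB | 20 | 3.00 | CCCC | CCCC | 30 |
4.00 | DDDD | DDDD | 40 | 5.00 | EEEE | EEEE | 50


CPSUB = 5:

1.00 | 2.00 | 3.00 | 4.00 | 5.00 | AAAA | AAAA | BBBB | BBBB | CCCC | CCCC |
DDDD | DDDD | EEEE | EEEE | 10 | 20 | 30 | 40 | 50


CPSUB = 3:

1.00 | 2.00 | 3.00 | AAAA | AAAA | BBBB | BBBB | CCCC | CCCC | 10 | 20 | 30 |
4.00 | 5.00 | ---- | DDDD | DDDD | EEEE | EEEE | ---- | ---- | 40 | 50 | ----

(---- marks an unused word in the partly filled second subblock.)


6.3.3 Accessing a Value from a Data Block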


For the nth row of a block, $\mathrm{SIZE}_n$ is the number of words for one
value in the row, and $\mathrm{DIC}_n$ is the displacement in column of the row,
which is defined as one plus the sum of the SIZEs of the first n − 1
rows. Thus for the example in Section 6.3.2:


n    SIZE (words)    DIC (words)

1    1               1

2    2               2  ($= 1 + \mathrm{SIZE}_1$)

3    1               4  ($= 1 + \mathrm{SIZE}_1 + \mathrm{SIZE}_2$)


The inputs to the algorithm for accessing a value of a data block 
are the identifier of a field, and a column number, COLNO. DIC, 
SIZE, and the group of the field can be determined since they are 
attributes of the field. WPC, CPSUB, and the stream identifier, S,
are then determined since they are attributes of the group. From these 
the following calculations are made using integer arithmetic: 


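\begin{align*}
\mathrm{SUBBLKS} &= \lfloor (\mathrm{COLNO} - 1) / \mathrm{CPSUB} \rfloor \\
\mathrm{WPSUB} &= \mathrm{WPC} \cdot \mathrm{CPSUB} \\
\mathrm{WORDSABOVE} &= \mathrm{CPSUB} \cdot (\mathrm{DIC} - 1) \\
\mathrm{WORDSLEFT} &= \bigl((\mathrm{COLNO} - 1) \bmod \mathrm{CPSUB}\bigr) \cdot \mathrm{SIZE} \\
\mathrm{WIS} &= 1 + \mathrm{SUBBLKS} \cdot \mathrm{WPSUB} + \mathrm{WORDSABOVE} + \mathrm{WORDSLEFT}
\end{align*}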


SUBBLKS is the number of subblocks previous to the subblock 
containing the sought value. WPSUB is the words per subblock. 
WORDSABOVE is the number of words above the sought value in its 
subblock. WORDSLEFT is the number of words to the left of the 
sought value in its row of the subblock. WIS is the word in stream of 
the first word of the sought value. Hence S and WIS and SIZE are
known. These are the inputs needed to access a stream, as described 
in Section 6.1. 
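
The calculation can be checked against the example of Section 6.3.2 with a short sketch. The function names are hypothetical. Columns are numbered from 1, so the count of preceding subblocks is taken here as (COLNO − 1) divided by CPSUB; this keeps column CPSUB inside the first subblock, in agreement with the definition of subblocks in Section 6.3.1.

```python
def dic_table(sizes: list) -> list:
    """Displacement in column of each row: 1 plus the sizes of earlier rows."""
    dics, total = [], 0
    for size in sizes:
        dics.append(1 + total)
        total += size
    return dics


def value_wis(colno: int, dic: int, size: int, wpc: int, cpsub: int) -> int:
    """Word in stream of the first word of a value, using integer arithmetic."""
    subblks = (colno - 1) // cpsub            # subblocks before the one sought
    wpsub = wpc * cpsub                       # words per subblock
    words_above = cpsub * (dic - 1)           # words in rows above, this subblock
    words_left = ((colno - 1) % cpsub) * size # words to the left in this row
    return 1 + subblks * wpsub + words_above + words_left


sizes = [1, 2, 1]          # SIZE for rows 1, 2, 3 of the example
dics = dic_table(sizes)    # displacement in column for each row
wpc = sum(sizes)           # words per column
```

For the example block, `value_wis(5, dics[1], sizes[1], wpc, 3)` locates the character value of column 5 with CPSUB = 3; the same call with CPSUB = 1 or 5 gives its position in the column-wise and row-wise layouts.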


6.3.4 Row-wise, Column-wise, and Intermediate Layout 


Note that when CPSUB equals 1 the order of storage is column-wise. 
When CPSUB equals the words per record, storage is row-wise. An 
intermediate setting of CPSUB between 1 and WPR will for certain 
usage patterns achieve performance superior to either column-wise 
or row-wise organizations. This is illustrated in the following example. 
Suppose that a block has 100 rows and 100 columns. Suppose that 
process R uses all the data in one row, and that process C uses all the 
data in one column, and that these processes are run equally often.
Suppose also that WPR = 100. Then if CPSUB = 1, C must read 
one record, R must read 100 records. The average records per process 
run is 50.5. If CPSUB = 100, C must read 100 records whereas R 
reads only one, for the same average. If CPSUB = 10 the block is 
divided into a 10 × 10 checkerboard of records. Each process must
read 10 records for an average of 10 records per process. This is the 
optimum CPSUB for this example. 
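
The record counts in this example can be reproduced by enumerating the words each process touches. The sketch assumes, beyond what the text states, that every value occupies one word (so the DIC of row n is n and WPC is 100) and that the stream begins on a record boundary; the function names are hypothetical.

```python
def wis_of(row: int, col: int, cpsub: int, rows: int = 100) -> int:
    """WIS of a one-word value in a block whose fields all have size 1."""
    subblks = (col - 1) // cpsub
    return 1 + subblks * rows * cpsub + cpsub * (row - 1) + (col - 1) % cpsub


def records_read(cells, cpsub: int, wpr: int = 100) -> int:
    """Number of distinct records holding the given (row, column) cells."""
    return len({(wis_of(r, c, cpsub) - 1) // wpr for r, c in cells})


def cost(cpsub: int, n: int = 100):
    """Records read by process R (one row) and process C (one column)."""
    r_reads = records_read([(1, c) for c in range(1, n + 1)], cpsub)
    c_reads = records_read([(r, 1) for r in range(1, n + 1)], cpsub)
    return r_reads, c_reads


for cpsub in (1, 10, 100):
    r_reads, c_reads = cost(cpsub)
    print(cpsub, r_reads, c_reads, (r_reads + c_reads) / 2)
```

Under these assumptions the averages come out as 50.5, 10, and 50.5 records for CPSUB = 1, 10, and 100, matching the figures in the text.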

A utility called CONVERT can be used to change a block from one 
value of CPSUB to another. Modifying CPSUB adjusts the data base 
to reflect a changed or unpredicted pattern of usage. It also makes 
possible periodic changing of the data layout to conform to a cyclic 
pattern of usage. All programs accessing a data block do so in terms 
of column numbers and fields. The assignment of a value to a block, 
row, or column is unchanged by CONVERT, and hence no program 
is invalidated. 


6.4 Review of the Advantages of Streams 


The process of design involves constructing transformations to 
achieve a desired structure using available structures as media. The 
desired structures for Master Links are a hierarchy and data blocks. 
The transformation is carried out in two steps, from direct-access 
files into streams and from streams into blocks and pointer sets. The 
structures and their attributes are summarized in this table: 


STRUCTURE                        ATTRIBUTES

Catalogued Direct-Access Files   Internal Identity (FSNO, FIFS)
                                 NAME
                                 Records Per File (RPF)
                                 Words Per Record (WPR)

Streams                          Stream Identity (S)
                                 Word In Stream (WIS)

Blocks                           Block Identity (B)
                                 Words Per Column (WPC)
                                 Columns Per Subblock (CPSUB)
                                 ROW n
                                 SIZE for ROW n
                                 Displacement In Column (DIC) for ROW n
                                 Column Number (COLNO)


It is no accident that streams, the intermediate structure, are so
simple. They amount to an idealized direct-access medium. The advantage
of using this intermediate structure is that it crystallizes the
separation of the Master Links structures from the physical-storage 
media. The programs that implement the desired structure are coded 
independent of the actual direct-access media. In particular, the 
parametrized layout of a block would be very cumbersome to imple- 
ment directly in terms of files, records, and word in record. It is very 
straightforward in terms of word in stream. 

Since the Master Links structure is separated from the physical 
media, media management utilities such as CONVERT can be run 
without altering any Master Links programs. The separation of 
structure from media also makes possible the implementation of 
alternative media. Streams might be implemented as arrays in primary 
storage for small data bases, or implemented in an entirely different 
manner upon direct access files, such as with all streams in one extensible
file. Finally, this separation enhances the portability of Master
Links, allowing most of the logic of the system to be based on a
machine-independent direct-access structure.


VII. EXPERIENCE AND FUTURE EXTENSIONS 


An experimental version of Master Links was operational in 1970. 
It was based on the concepts and supported all the features reported 
in this article, except portability and certain utilities. A production 
version was completed in May 1972. It supports all features, including 
portability, all utilities, two different stream implementations, plus 
improved performance. These versions have been used for a variety 
of different types of projects: inventory, financial, budget and resource 
allocation, and construction program administration data bases. 
Together with the Natural Dialogue System! it forms the basis of the 
Off-The-Shelf-System.? 

Several efforts are under way to extend and improve the system: 


(i) Networks: allowing a group to have more than one parent.

(ii) Field length data: allowing strings of data, such as a time
series of values or a paragraph of text, to be stored efficiently
as a single value.

(iii) Function evaluation: computing in parallel all requested level
raises that are defined over a common subtree. Hence, in
“total IN STOCK divided by total ON ORDER,” the numerator
and denominator totals will be taken simultaneously in a
single pass over the item entities of a department.

(iv) Access tree generator: allowing execution-time determination
of the hierarchy. Suppose, for instance, that a new item field,
item class, describes the level of supervision required to approve
acquisition of the item. Then “total IN STOCK by item
class” is a meaningful function, but the hierarchy formed by
partitioning item entities according to their values of item
class must be computed at execution time, if the “by” field is
allowed to be arbitrarily specified by the user.

(v) Report generator: accepting a description of the content and
layout of a report and on request producing an instance of the
report.


Information Management System:


The Natural Dialogue System 


By B. W. PUERLING and J. T. ROBERTO 
(Manuscript received October 5, 1972) 


The Natural Dialogue System (NDS) is a software system designed 
to permit the easy implementation of time-shared computer programs 
which employ sophisticated forms of man-machine dialogue to converse 
with members of a nonprogrammer user audience. The heart of the system 
is a syntax-directed translator which recognizes user input messages and 
translates them into an internal text of integers for use by the program. 
NDS allows the language designer to specify the syntax of the statements 
in his language, the form of their translations, methods for diagnosing 
errors in user’s input, diagnostic messages to be generated, and the style 
of dialogue which will exist between the programs and their users. This is 
accomplished through a dialogue description and a language description 
consisting of syntactic specification elements with semantic procedures 
embedded within them. Use of NDS allows the language designer to 
produce an interactive language which is tailor-made for both his users
and his programs. NDS relieves the language designer of the necessity of 
writing a complex message analyzer, thereby substantially reducing the 
effort required to produce systems that offer these forms of man-machine 
dialogue. Furthermore, use of NDS allows such systems to be implemented 
by less sophisticated programming talent than would otherwise be 
necessary. 


I. INTRODUCTION 


The Natural Dialogue System (NDS) is a tool to aid programmers in 
the implementation of time-sharing-based computer systems which 
employ keyword-oriented languages and a variety of styles of man- 
machine dialogue to converse with members of a nonprogrammer user 
audience. By keyword-oriented languages we mean languages of the 
type illustrated by Sinowitz¹ and suggested as an alternative to natural
language for communicating with information management systems
by Chai.² NDS provides the designer of such a language with the
ability to define the syntax of the statements in the language, the
forms of their translations, methods for detecting errors made by users
of the language, and diagnostic messages to be sent to users when 
errors are detected. In addition, NDS provides facilities for the 
language designer to specify the style of dialogue which will exist 
between the system and its users. 

NDS has been operational on an experimental basis since 1970. It 
has been implemented under five different host operating systems 
(including one batch system). Its primary use has been in the area 
of interactive query languages for information management systems, 
including inventory management systems, a budget control system,? 
a work force administration system, information retrieval systems 
based on surveys of financial and equipment data, and a general- 
purpose hierarchical data base management system. Other uses have 
included a data checking specification language, a report generator 
composition language, and bulk data input/output format specifica- 
tion languages. NDS has also led to further work in the area of tools 
for interactive language design presented by Heindel and Roberto.® 

In order to get a feel for what can be done using NDS, some concepts 
concerning styles of man-machine dialogue are presented, followed by 
a description of the styles of dialogue obtainable using NDS. NDS 
itself is then described, including an overview of the modules of the 
system and certain details concerning the specification of systems to be 
implemented using it. 


II. DIALOGUE CONCEPTS 


In making conversational software systems available to non- 
programming audiences for purposes of information retrieval and 
problem solving in general, a broad spectrum of conversational styles 
has evolved. At the extreme ends of this spectrum we have machine- 
initiated dialogue and user-initiated dialogue. In the machine-initiated 
style, the user is asked questions by the computer. These questions 
are designed to find out, in an orderly way, what the user wishes the 
system to do for him. In the user-initiated style, the user presents his 
problem to the system and directs its action. The two styles are best 
illustrated by example. 

(i) Machine-initiated dialogue.
Computer: WHAT IS THE VALUE OF EXPENSE?
User: 50000
Computer: DO YOU WANT THE VALUE OF PROFIT?
User: YES
Computer: PROFIT IS 40000
(ii) User-initiated dialogue.
User: EXPENSE IS 50000. WHAT IS PROFIT?
Computer: PROFIT IS 40000


Coupled with these different styles of dialogue is the problem of 
conversational dynamics where at some point in time during the 
conversation the subservient participant wishes to seize the initiative. 
For example: 


(i) User seizes initiative from computer.
Computer: WHAT IS EXPENSE?
User: IGNORE EXPENSE. REVENUE IS 80000.
(ii) Computer seizes initiative from user.
User: TAX IS 5%. WHAT IS PROFIT?
Computer: PROFIT CANNOT BE COMPUTED YET,
WHAT IS EXPENSE?
User: 50000


With either of these styles, a user is apt to input information which is 
syntactically or semantically incorrect. Input of this nature should not 
cause the conversational program to abort. On detecting invalid input 
a conversational program may output terse messages such as 
“WHAT?” or “SYNTAX ERROR” and then invite re-entry of the 
text in question. Alternatively, programs may output lengthy explanations
of valid replies and again ask the user to continue his input. The
nature of handling invalid input depends primarily on the experience 
level of the end-users as well as the experience level of the person 
implementing the conversational software. In general, a language 
designer should have the tools at his disposal to tailor his language, 
including handling of invalid input, to correspond precisely to the 
environment in which it will be used. 

In general, a transaction between man and machine can be viewed 
as a consulting effort between two “experts”:


(i) the machine, which is an expert in delivering facts or computing
results based on input data, and

(ii) the person, who is an expert in the problem to be solved, the
environment in which the problem arose, and certain subjective
considerations of the possible solutions.


In this situation, if the person is burdened by certain conversational
constraints, the creative, exploratory environment may suffer. In other
words, the conversation between man and machine should be made as 
natural as possible for the person. With this in mind, NDS offers a 
variety of dialogue styles which encompass most of the initiative 
spectrum with the emphasis on the user-initiated end. 


III. STYLE OF NDS DIALOGUE 


A user’s message or request to a system using NDS consists of a 
series of statements separated by colons. Each statement in the 
language consists of a unique keyword followed by a sequence of 
characters called the clause of the statement. A message is ended by 
the statement GO:. A message to an information management system 
might be: 


PRINT ITEM NUMBER, SELLING PRICE/PURCHASE
COST, REORDER DATE: WHEN AMOUNT IN STOCK > 1000:
IN ALL DEPARTMENTS: GO:


This message consists of three statements with the keywords PRINT, 
WHEN, and IN respectively. The PRINT statement fills the same 
role as a verb in the English language since it directs the information 
system to print the information specified in its clause. In general, a 
user’s message must contain a single verb statement. The WHEN 
and IN statements act as modifiers (adverbial or prepositional) of 
the PRINT verb. In general, a user’s message contains zero, one, or 
more modifier statements. The statements in a message can be given 
in any order since NDS does not consider a message to be complete 
until the GO statement is encountered. 
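
As a rough illustration of this message format, the following Python sketch splits a message into keyword and clause pairs and stops at the GO statement. It is not the NDS implementation: the function name `split_message` and the rule that the keyword is the first blank-delimited word are assumptions made for the example, and clauses that themselves contain colons are not handled.

```python
def split_message(text):
    """Split a message into (keyword, clause) pairs, stopping at GO.

    Assumes every statement ends with a colon and that the keyword is
    the first blank-delimited word (a simplification for illustration).
    """
    statements = []
    complete = False
    for raw in text.split(":"):
        words = raw.split()
        if not words:
            continue  # skip empty pieces, e.g. the text after the last colon
        keyword, clause = words[0].upper(), " ".join(words[1:])
        if keyword == "GO":
            complete = True  # GO marks the end of the message
            break
        statements.append((keyword, clause))
    return statements, complete
```

Because the message is not considered complete until GO is seen, the second return value lets a caller keep collecting statements across several input lines.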

An important feature which NDS offers is the automatic edit mode. 
NDS remembers the state of the dialogue from message to message. 
Once a statement is correctly given by a user, that statement remains 
as part of the “current” message until the user deletes it or replaces it.
Therefore, after the system acts on the above request, the user may 
continue the dialogue by typing: 


WHEN AMOUNT IN STOCK > 2000: GO: 


The PRINT and IN statements which were given as part of the first 
message are carried over as part of the second message. Therefore, 
to NDS the second message becomes: 


PRINT ITEM NUMBER, SELLING PRICE/PURCHASE 
COST, REORDER DATE: WHEN AMOUNT IN STOCK
> 2000: IN ALL DEPARTMENTS: GO:


Once a verb statement is correctly given by a user, that statement
remains as part of the current message until the user deletes it, replaces 
it, or enters the statement for a different verb in the language. Thus, 
following action on the second message, the user may continue his 
dialogue by typing: 


DISTRIBUTE ITEMS BY SELLING PRICE: GO: 


The IN statement from the first message and the WHEN statement 
given in the second message are carried over as part of the third 
message. If the user continues his dialogue by inputting a statement 
whose clause has an invalid construct according to the definition of 
the clause given by the language designer, the system will print a 
diagnostic message (possibly language designer defined) and remove 
the statement from the current state of the dialogue. 
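
The carry-over behavior of the automatic edit mode can be modeled with a small amount of state. The Python sketch below is a hypothetical model, not NDS code: the class name `DialogueState`, the `valid` flag standing in for the result of clause analysis, and the use of keywords as dictionary keys are all assumptions.

```python
class DialogueState:
    """Remembers the active statements between messages (illustrative model)."""

    def __init__(self, verbs):
        self.verbs = set(verbs)  # keywords that act as verbs
        self.active = {}         # keyword -> clause of each active statement

    def enter(self, keyword, clause, valid=True):
        if not valid:
            # An invalid statement is removed from the current state.
            self.active.pop(keyword, None)
            return
        if keyword in self.verbs:
            # Entering a verb displaces whichever verb was active before.
            for old in self.verbs & set(self.active):
                del self.active[old]
        # A modifier (or the verb itself) replaces any earlier version.
        self.active[keyword] = clause
```

Replaying the three messages of this section against such a model leaves DISTRIBUTE, the second WHEN clause, and the original IN clause active, which matches the behavior described above.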

In addition to verbs and modifiers, a language may contain one or 
more special statements termed dialogue service statements. These 
statements usually take the form of aids to the user of a language 
or debugging tools for the language designer. Services may be included 
which provide the user with explanations of terms used in the language, 
news of recent changes to the application, instructions on the use of 
the language, the ability to change the initiative of the dialogue, or 
any other facilities which the language designer deems appropriate. 
For himself, the language designer may include statements which 
provide dumps or activate traces or timings within his programs. 

Through the semantic facilities provided by NDS, a language 
designer is capable of detecting syntactic or semantic errors in a 
user’s input, informing him of the error, and then allowing him to 
correct just that part of the text in question. Using this approach a 
typical interaction might be: 


User: PRINT STORE NAME, EARMINGS: WHEN 
DEPRECIATION > 40%: GO: 
Computer: ‘EARMINGS’ IS A MISPELLED NAME, REENTER.
User: EARNINGS
Computer: DEPRECIATION CANNOT EXCEED 25%,
REENTER.
User: 20%


Note that the user need correct only that part of the text which is 
incorrect. 

For certain applications or for certain user experience levels,
computer-initiated dialogue is a meaningful style of man-machine
interaction. A language designer may implement this style of dialogue
using the semantic facilities of NDS. The user must initiate the change 
in the style of the dialogue through a statement in the language. 
From that point in time, the machine may have the initiative and 
may interact with the user in a question and answer style illustrated 
by the following example: 


User: HELP: 
Computer: WHAT DO YOU WANT TO DO? (PRINT, RANK, 
PLOT) 
User: PRINT 
Computer: WHAT INFORMATION DO YOU WANT 
PRINTED? 
User: EARNINGS 
Computer: FOR WHICH STORES? 
User: BUFFALO, SYRACUSE 


Thus, a wide variety of dialogue styles is obtainable using NDS. The 
system itself is now described. 


IV. AN OVERVIEW OF NDS 


As illustrated in Fig. 1, NDS consists of two phases, a setup phase 
and an execution phase. These two phases interface with two different 
audiences, a language designer and the set of end-users of the language 
designer’s system. 

The language designer prepares a description of his language and 
dialogue style (details of which will be described later) to be presented 
to the setup phase of NDS. The setup phase translates these descrip- 
tions from a form suitable to the programmer to a form suitable to the 
execution phase of NDS. These translations are written by the setup 
phase onto a set of language analysis and dialogue monitor driving 
files for later access by the execution phase. The language designer 
also prepares a set of program modules containing programs to perform 
the tasks corresponding to the verbs and dialogue service statements 
in the language. At appropriate times during the dialogue, the execu- 
tion phase of NDS will pass control to these program modules to 
perform the appropriate tasks. 

Fig. 1. An overview of NDS.

Users communicate with the system through the execution phase of
NDS. The execution phase consists of a dialogue monitor, a language
analysis module, and a set of “built-in” dialogue service functions
which are accessible to all languages. The dialogue monitor accepts
input from the user of the system in the form of a series of statements,
each beginning with a keyword and ending with the character colon 
(:). The dialogue monitor breaks out the clause of each statement and 
passes it to the language analysis module together with the clause 
description as specified by the language designer. The language analysis
module attempts to parse and translate the clause into an internal 
text of integers as specified by the clause description. The algorithm 
employed by the language analysis module is an extension of the 
top-down left-to-right algorithm given by Cheatham and Sattley.® 
Successful translations returned by the language analysis module to 
the dialogue monitor are placed in translation space for later access 
and a record is kept regarding which statements are currently active 
in the dialogue. 

The GO statement is used to indicate that the user’s message to 
the system is complete. When it is encountered, NDS makes a series 
of checks, called GO-analysis, which ensure that any interstatement
relationships declared by the language designer have been fulfilled. 
There are really two kinds of GO-analysis. One occurs when the current
verb is different from the last verb which was successfully executed.
The system checks that a current verb exists, that all required modi- 
fiers of the current verb are active, and that no modifier (required or 
optional) of the current verb is inactive because it was incorrectly 
given since the last time the GO statement was encountered. The other 
type of GO-analysis occurs when the current verb is the same as the 
last verb which was successfully executed. The system makes the 
same checks described above, but also checks that at least one modifier 
of the current verb (required or optional) has been correctly given since 
the verb was last executed. If other interstatement relationships exist, 
facilities are provided for the language designer to specify additional 
checks to be executed as part of GO-analysis. If GO-analysis is success- 
ful, NDS passes control to the program module corresponding to the 
current active verb. When execution of the module is complete, control 
is returned to the dialogue monitor and the dialogue with the user is 
resumed. 
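
The standard GO-analysis checks can be summarized in code. The Python sketch below follows the description above under several assumptions: the function name and argument names are hypothetical, statements are identified by keyword, and "the same verb" is decided by comparing keywords, which the text does not spell out. Designer-specified additional checks are omitted.

```python
def go_analysis(verb, required, optional, active, failed_since_go,
                given_since_exec, last_executed_verb):
    """Return a list of problems; an empty list means the verb may run.

    verb               -- keyword of the current verb, or None
    required, optional -- modifier keywords declared for that verb
    active             -- keywords of all currently active statements
    failed_since_go    -- keywords given incorrectly since the last GO
    given_since_exec   -- keywords given correctly since the verb last ran
    last_executed_verb -- keyword of the last successfully executed verb
    """
    if verb is None:
        return ["no current verb"]
    problems = []
    modifiers = set(required) | set(optional)
    for m in sorted(set(required) - set(active)):
        problems.append(f"required modifier {m} is not active")
    for m in sorted(modifiers & set(failed_since_go)):
        problems.append(f"modifier {m} was given incorrectly")
    # Second kind of GO-analysis: same verb as last time, so something
    # must have changed for re-execution to be meaningful.
    if verb == last_executed_verb and not (modifiers & set(given_since_exec)):
        problems.append("nothing has changed since the verb last ran")
    return problems
```

Returning a list of problems rather than a single flag is a design choice of the sketch; it lets a dialogue monitor report every failed check in one reply.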

When the dialogue monitor recognizes a dialogue service statement 
in a user’s message, control is passed to the appropriate program 
module immediately. When execution of the module is complete, 
control is returned to NDS and the dialogue continues. 

NDS provides a set of “built-in” dialogue service statements to 
provide services common to all languages. These include: 


STOP         disconnect the user from the system
RETURN       return control to the host operating system
DELETE       remove rather than replace a currently active
             statement
CLEAR        remove all currently active statements
RECAP        print out to the user all currently active
             statements
DETAIL       cause automatic recapping of the current
             message when the GO statement is recognized
VOCABULARY   print out to the user a list of keywords and
             their synonyms for all statements in the
             language
INPUT        direct NDS to take its input from a previously
             prepared character sequential file
DUMP         print out the translation of a currently active
             statement or all currently active statements


The language designer may redefine any of these “built-in” dialogue
service statements by including in his language a dialogue service
statement of the same name and providing a corresponding program
to perform the function he desires. The corresponding “built-in” 
statement is then not accessible to users of the language. 
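
The redefinition rule amounts to a lookup in which designer-supplied services shadow built-ins of the same name. A minimal Python sketch follows; the table contents, the function name `resolve_service`, and the representation of the dialogue state as a dictionary are assumptions for illustration only.

```python
# Two stand-in built-ins; the real set is the list given above.
BUILT_INS = {
    "RECAP": lambda state: "\n".join(f"{k} {v}" for k, v in state.items()),
    "CLEAR": lambda state: state.clear(),
}


def resolve_service(keyword, designer_services):
    """Return the program for a dialogue service statement, or None.

    A designer-supplied service hides the built-in of the same name.
    """
    if keyword in designer_services:
        return designer_services[keyword]
    return BUILT_INS.get(keyword)
```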


V. SYSTEM SPECIFICATION 


In order to create a system using NDS, the language designer must 
supply a dialogue description and a language description to the NDS 
setup phase and prepare a set of programs to be called by the NDS 
dialogue monitor to perform whatever tasks may be requested by his 
users. The general forms of these specifications are now described. 


5.1 Dialogue Description 


The dialogue description of a language written using NDS consists 
of a descriptor for each statement in the language. Each descriptor 
consists of a series of attributes which are to be assigned to the state- 
ment. These attributes define certain properties of the statement and 
may define certain relationships between the statement being described 
and other statements in the language. In general, the attributes of a 
statement are the statement identifier, statement keyword(s), clause 
syntactic type, translation allocator, verb indicator, required and 
optional modifier specifications, additional GO-analysis checks,
dialogue service indicator, program control information, and Polish
indicator.

The statement identifier is a unique positive integer which can be 
thought of as the internal identifier of the statement. This number is 
used as a key to locate, store, and interrogate the translation of the
statement in translation space. The statement keyword, a unique 
character string containing no blank characters, is the external identi- 
fier of the statement. NDS allows a statement to have an arbitrary 
number of keyword synonyms which again must be unique for the 
entire language. 

If a statement in a language is to have a clause following its keyword, 
then the language designer must specify a clause syntactic type as 
part of the descriptor for that statement. The clause syntactic type 
is the link between the dialogue descriptor of the statement and that 
part of the language description which defines the syntax and semantics 
of its clause. A statement having a clause syntactic type as an attribute 
must also have a translation allocator. The translation allocator is 
used to specify the length of the largest possible translation of the 
clause of the statement. If the user inputs a statement whose
translation size exceeds that indicated by its translation allocator, the
dialogue monitor will output a standard message indicating that the 
statement is too long. 

If a statement in the language is to be recognized as a verb, then 
a verb indicator must be specified as one of the attributes of the 
statement. A verb statement may require other statements to be 
present in the current message when the user types the GO statement. 
These are called required modifiers. If a verb requires other statements 
to be present, the language designer specifies these statements accord- 
ing to their respective statement identifiers as part of the verb’s 
descriptor. Statements which are not required in the same message 
with a verb, but which somehow change the meaning or action of the 
verb, and may therefore be thought of as optional modifiers, are 
specified in an identical fashion. If other, more complex relationships 
are to exist between a verb and other statements in the language, 
facilities are available for the language designer to include, in the 
descriptor of the verb, checks of these relationships to be performed as 
part of GO-analysis. 

If a statement is to be recognized as a dialogue service statement, a 
dialogue service indicator must be part of its descriptor. A dialogue 
service statement may have a clause, in which case a clause syntactic 
type and translation allocator must be given as part of its descriptor. 
For both verbs and dialogue service statements, program control 
information must be specified as an attribute in the descriptor of the 
statement. This information identifies the program module which 
contains the program corresponding to the verb or dialogue service 
statement being described. 
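
One way to picture a statement descriptor is as a record with one field per attribute. The Python sketch below is an assumed representation, not the format accepted by the NDS setup phase; the field names and the `validate` helper are hypothetical, and only a few of the constraints stated in the text are checked.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StatementDescriptor:
    identifier: int                         # unique positive integer
    keywords: list                          # keyword plus any synonyms
    syntax_type: Optional[str] = None       # clause syntactic type, if any
    max_translation: Optional[int] = None   # translation allocator
    is_verb: bool = False
    required_modifiers: list = field(default_factory=list)
    optional_modifiers: list = field(default_factory=list)
    is_service: bool = False
    program: Optional[str] = None           # program control information
    polish: bool = False                    # Polish indicator


def validate(descriptors):
    """Check some of the constraints stated in the text; return error strings."""
    errors, seen_ids, seen_words = [], set(), set()
    for d in descriptors:
        if d.identifier <= 0 or d.identifier in seen_ids:
            errors.append(f"statement {d.identifier}: identifier must be unique and positive")
        seen_ids.add(d.identifier)
        for w in d.keywords:
            if " " in w or w in seen_words:
                errors.append(f"statement {d.identifier}: bad keyword {w!r}")
            seen_words.add(w)
        if d.syntax_type is not None and d.max_translation is None:
            errors.append(f"statement {d.identifier}: clause needs a translation allocator")
        if (d.is_verb or d.is_service) and d.program is None:
            errors.append(f"statement {d.identifier}: no program control information")
    return errors
```

For instance, the two statements of the appendix could be written as `StatementDescriptor(1, ["PLOT", "P"], "PLT", 3, is_verb=True, required_modifiers=[2], program="GPLOTR")` and `StatementDescriptor(2, ["FOR", "FP"], "FOR", 2)`.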

The Polish indicator is used to specify that the clause of a statement 
consists of a function containing operators and operands and that the 
language designer has followed certain rules in defining the clause 
syntactic type of the statement. When a successful translation of a 
statement with the Polish indicator is returned to the dialogue monitor, 
it will be converted to early-operator Polish postfix notation’ before 
being stored in translation space. 
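
The conversion of an operator and operand clause to postfix form can be illustrated with the familiar stack-based method driven by operator precedences. The Python sketch below produces ordinary postfix notation; it is not claimed to reproduce the early-operator variant used by NDS, and the function name, token representation, and left-associativity of all operators are assumptions.

```python
def to_postfix(tokens, precedence):
    """Convert a list of infix tokens to postfix order.

    precedence maps each operator to an integer; higher binds tighter.
    All operators are treated as left-associative, and parentheses are
    assumed to be balanced.
    """
    output, stack = [], []
    for tok in tokens:
        if tok == "(":
            stack.append(tok)
        elif tok == ")":
            while stack[-1] != "(":
                output.append(stack.pop())
            stack.pop()  # discard the matching "("
        elif tok in precedence:
            # Emit stacked operators that bind at least as tightly.
            while (stack and stack[-1] != "("
                   and precedence[stack[-1]] >= precedence[tok]):
                output.append(stack.pop())
            stack.append(tok)
        else:
            output.append(tok)  # operand
    while stack:
        output.append(stack.pop())
    return output
```

The precedence table plays the role of the precedence value that a language designer attaches to each operator literal.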


5.2 Language Description 


The clause descriptions for the statements in a language are written 
in a language specification meta-language. This meta-language is 
really a combination of two distinct languages: a descriptive language 
which is used to describe the syntax or structure of a clause and,
embedded in it, a procedural language, called the Natural Dialogue
Semantic Programming Language (NDSPL), which is used to specify 
context-dependent syntax checks, modifications to the normal clause 
translations, diagnostic messages, error correction methods, and 
changes in the initiative of the dialogue. 

In the descriptive meta-language, a syntactic type is indicated by 
a name surrounded by ( and ). A syntactic-type definition consists 
of a syntactic type followed by an equal sign (read as “‘is defined as’’) 
followed by a sequence of language specification elements which define 
the syntactic type. The language specification elements are members 
of the following set: 


(...)         syntactic type
|             exclusive or (alternation indicator)
&             and (conjunction indicator), used to indicate that a
              portion of a clause is to be parsed and translated in more
              than one way
[...](i, j)   a repeating group of specification elements to be repeated
              at least i times but not more than j times, j ≥ i ≥ 0,
              j > 0; i = j = 1 if the parenthesized pair is omitted
'...'         a semantic procedure type consisting of one or more
              NDSPL statements enclosed in primes
“...”(t, p)   a non-null terminal character string enclosed in quotes,
              called a literal, with its translation number t and, if the
              literal is to act as an operator, its precedence p; null
              translation if the parenthesized pair is omitted
S             any member of the terminal class string, consisting of all
              non-null character strings
N             any member of the terminal class number, consisting of
              all numbers, signed or unsigned, with or without a
              decimal point
V             any member of a language-designer-defined terminal class

The complete specification of a language consists of a syntactic- 
type definition for each of the clause syntactic types associated with a 
statement in the dialogue description part of the system description. 
This specification provides instructions to the language analysis 
module of the dialogue monitor to parse and translate user input 
statements. Once the dialogue monitor has recognized a keyword in a 
user’s input statement, it passes the clause following that keyword 
along with the clause syntactic type associated with the statement to
the language analysis module for parsing and translation. The parser
applies the clause syntactic-type definition on the input clause from
left to right. If the parser encounters an element representing a
terminal class (V, N, S, or literal), it must match an initial substring
of the remaining input clause as a member of that terminal class and 
add to the translation of the clause appropriately. If the parser 
encounters a syntactic-type definition, it must apply it left to right 
on the remaining input clause. If the parser encounters a semantic 
procedure type, it must execute it. When the parser simultaneously 
encounters the end of the clause syntactic-type definition and the 
end of the user’s input clause, the parse is successful and the completed 
translation of the clause is returned to the dialogue monitor to be 
placed in translation space. 

Of the language specification elements available to a language 
designer using NDS, two deserve more detailed discussion: the 
terminal class V and the semantic procedure type. The terminal class 
V consists of a set of character strings defined by the language designer. 
Each member of the class is assigned a set of integer attributes. The 
occurrence of a V in a syntactic-type definition instructs the parser 
to match an initial substring of the remaining input clause with a 
member of this set of character strings, to append the value of its 
first attribute to the translation, and to make the values of its other 
attributes available for examination by succeeding semantic procedure 
types. The use of the terminal class V allows the language designer to 
specify the skeleton of a language where certain terminal class members 
must be chosen from a particular set. The composition of this set may 
then be changed without affecting the language specification. 
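
Matching a member of the terminal class V can be sketched as a table lookup on an initial substring of the clause. The Python fragment below is illustrative only: the table reuses the sample symbols of Section 5.4, while the function name `match_v`, the longest-match rule, and the case-insensitive comparison are assumptions. No word-boundary check is made.

```python
# Symbol -> (first attribute, other attributes...); values from Section 5.4.
V_CLASS = {
    "YEAR": (1, 1),
    "COMPANY": (2, 3),
    "EMPLOYEES": (3, 1),
    "REVENUES": (4, 2),
    "EXPENSES": (5, 2),
}


def match_v(clause, table=V_CLASS):
    """Match the longest member of V at the start of the clause.

    Returns (first_attribute, other_attributes, remaining_text), or None
    if no member of V is an initial substring of the clause.
    """
    text = clause.lstrip()
    upper = text.upper()
    candidates = [symbol for symbol in table if upper.startswith(symbol)]
    if not candidates:
        return None
    best = max(candidates, key=len)
    first, *others = table[best]
    # The first attribute goes into the translation; the rest are kept
    # for later semantic procedures to examine.
    return first, tuple(others), text[len(best):]
```

Because the symbols live in a table rather than in the syntax definition, the set can be changed without touching the language specification, which is the point made above.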

The semantic procedure type is the means by which the procedural 
part of the meta-language is embedded within the descriptive part. 
It consists of one or more NDSPL statements surrounded by primes 
and can succeed or fail just as all other language specification elements 
can succeed or fail. The statements available in NDSPL are the 
following: 


SET         arithmetic assignment statement
IF          control for conditional execution (similar to the logical
            IF statement in FORTRAN IV)
FOR, NEXT   iteration control statements
GOTO        unconditional transfer
STASH       add to the current translation
UNSTASH     remove from the current translation
PRINT       print a message to the user
FAIL        cause unconditional failure of a semantic procedure
            type
S&F         arithmetic assignment and cause unconditional failure
            of a semantic procedure type
P&F         print a message to the user and cause unconditional
            failure of a semantic procedure type
TEST        cause conditional failure of a semantic procedure type
T&P         cause conditional printing of a message to the user
            and failure of a semantic procedure type
READ        cause a recursive call on the parser to ask the user for
            additional input and to parse it according to some
            syntactic-type definition
CALL        cause control to be passed to a language-designer-
            provided own-code semantic procedure which may
            succeed or fail


The data which are available for manipulation in NDSPL include 
numeric constants; a set of language-designer-declared variables; the 
current translation; the attributes of the most recently matched 
members of V, N, and S; a set of variables provided by NDS which
give a picture of the current state of the parse; and the current state
of the dialogue. Messages printed to the user through NDSPL may
include any of the above data plus constant character strings; the most
recently matched members of V, N, and S; and the character string
which the parser most recently attempted to match as a member of
V, N, or S.

The semantic procedure type and the facilities of NDSPL give the 
language designer a powerful tool for creating an interactive language 
and style of dialogue which are tailored to both his end-user’s needs 
and the needs of his programs. First of all, he has the ability to do 
context-dependent syntax checks by setting flags or saving the at- 
tributes of terminal class members at one point in the parse for later 
examination to determine what course the parse has taken or should 
take. He also has the ability to add to, delete from, or modify the 
normal clause translations using arithmetic functions of the available 
data. Thus, the translations of the user’s input can be tailored to the 
needs of the application programs. The output facilities of NDSPL 
provide him with the means to supply his users with timely, relevant 
error diagnostics when errors are detected in their input statements. 
Moreover, the READ statement gives him the ability to seize the
initiative in the dialogue. This can be used to ask for error corrections
in mid-parse or to change the style of dialogue from user-initiated to 
computer-initiated. 


5.3 Programs 


Except for any own-code semantic procedures needed by the 
language, the only programs which must be supplied by the language 
designer are the programs for each verb and each dialogue service 
statement in the language. When a user issues a dialogue service 
statement or executes a verb using the GO statement, the NDS 
dialogue monitor uses the program control information given in the 
dialogue description to pass control to the proper program module. 
The program has access to translation space and to a set of NDS- 
provided variables which give a picture of the current state of the 
dialogue. An application program can be as simple or as complex as is 
necessary to perform the desired task. When execution of the module is 
complete, control is returned to NDS and the dialogue with the user is 
resumed. 


5.4 A Simple Illustrative Example


Suppose that one wishes to implement an information retrieval 
language which allows users to do scatter plots of one variable versus 
another. The values for these variables are to come from a data base 
containing data for the years 1961 to 1973. The plot process is to be 
implemented as a verb, specifying the variables to be plotted, and one 
required modifier specifying a range of years for which data values are 
to be included in the plot. The specified variables must, of course, 
have numeric values. The user will be allowed to make requests 
such as 


PLOT EMPLOYEES BY REVENUES: FOR 1965-1971: GO: 
and 
PLOT REVENUES BY EXPENSES: FOR 1961 THRU 1970: GO: 


The computer program which has been written to carry out the plot 
request requires as input the internal numeric identifiers of the two 
variables and two numbers from one to thirteen, in increasing order, 
which specify the span of years to be included. The problem is to 
design a language to translate a user’s request into the necessary 
computer program inputs, ensuring that a valid request has been made.

Suppose that the terminal class V contains, in part:


Symbol       Attributes
             Varno   Type
YEAR           1       1
COMPANY        2       3
EMPLOYEES      3       1
REVENUES       4       2
EXPENSES       5       2


where the symbols are variable names and the attributes of a variable 
are its internal numeric identifier and its type (1 for integer, 2 for 
floating point, and 3 for character). 

The specification of one possible language for communicating with 
the plot processor is given in the appendix. This language specification 
could result in a dialogue similar to the following (user input shown 
in lower case): 


REQUEST) plot employees by revenues: for 1965-1970: go: 


The PLOT processor would produce a plot of variables 3 and 4 
at level 2 for years 5-10. 


REQUEST) plot employees by company: 


COMPANY IS A NON-NUMERIC VARIABLE ILLEGAL FOR 
PLOT 


REQUEST) plot revenues by expenses: go: 


The PLOT processor would produce a plot of variables 4 and 5 
at level 2 for years 5-10. 


REQUEST) for 1970 thru 1975: 


1975 MUST BE A YEAR BETWEEN 1961 AND 1973 
INCLUSIVE 


REQUEST) for 1968 thru 1962: go: 


The PLOT processor would produce a plot of variables 4 and 5 
at level 2 for years 2-8. 
REQUEST) 
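
The translations in this dialogue can be reproduced with a short Python sketch. It imitates the effect of the language specification in the appendix but is not NDSPL and not the authors' code: the function names, the `ParseError` exception, and the two "EXPECTED" messages are hypothetical, and the clause is split on blanks rather than parsed by a syntax-driven parser.

```python
import re

# Symbol -> (Varno, Type); Type 3 means character, i.e. non-numeric.
V_CLASS = {"YEAR": (1, 1), "COMPANY": (2, 3), "EMPLOYEES": (3, 1),
           "REVENUES": (4, 2), "EXPENSES": (5, 2)}


class ParseError(Exception):
    """Raised with the diagnostic that would be printed to the user."""


def numvar(word):
    name = word.upper()
    if name not in V_CLASS:
        raise ParseError(f"{name} IS NOT VARIABLE NAME")
    varno, vtype = V_CLASS[name]
    if vtype == 3:
        raise ParseError(f"{name} IS A NON-NUMERIC VARIABLE ILLEGAL FOR PLOT")
    return varno


def translate_plot(clause):
    words = clause.split()
    if len(words) != 3 or words[1].upper() != "BY":
        raise ParseError("EXPECTED: variable BY variable")  # invented message
    return [numvar(words[0]), numvar(words[2])]


def year(text):
    if not text.isdigit() or not 1961 <= int(text) <= 1973:
        raise ParseError(f"{text} MUST BE A YEAR BETWEEN 1961 AND 1973 INCLUSIVE")
    return int(text) - 1960  # 1961..1973 becomes 1..13


def translate_for(clause):
    parts = re.split(r"\s*-\s*|\s+THRU\s+", clause.strip(), flags=re.IGNORECASE)
    if len(parts) != 2:
        raise ParseError("EXPECTED: year THRU year")  # invented message
    lo, hi = year(parts[0]), year(parts[1])
    return [lo, hi] if lo <= hi else [hi, lo]  # reverse if out of order
```

With these definitions, `translate_plot("employees by revenues")` yields the variable numbers 3 and 4, and `translate_for("1968 thru 1962")` yields the reordered span 2 and 8, as in the dialogue above.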


VI. CONCLUSION 


NDS provides a means for the easy implementation of time-sharing- 
based systems which employ a keyword style of man-machine dialogue.
Facilities are available to specify the syntax of the keyword clauses;
the forms of their translations; timely, relevant error diagnostics; and 
a spectrum of dialogue styles, ranging from computer-initiated dialogue 
to user-initiated dialogue. 

NDS has been used to produce a variety of application systems 
primarily in the area of interactive query languages for information 
management systems. Use of NDS substantially reduces the pro- 
gramming effort required to implement such systems; and moreover, 
the implementation may be done utilizing less sophisticated pro- 
gramming talent than would otherwise be necessary. 


APPENDIX 


ATTRIBUTES VARNO, TYPE
SCRATCH CELLS TEMP

NOTVAR = T “IS NOT VARIABLE NAME”
NONNUM = V “IS A NON-NUMERIC VARIABLE ILLEGAL FOR PLOT”
BADYR = T “MUST BE A YEAR BETWEEN 1961 AND 1973 INCLUSIVE”

(PLT) = (NUMVAR) “BY” (NUMVAR)
*
* IF SYMBOL NOT IN TERMINAL CLASS V, PRINT NOTVAR
* IF SYMBOL IS A NON-NUMERIC VARIABLE, PRINT NONNUM
*
(NUMVAR) = [V | 'P&F NOTVAR'] 'IF (TYPE = 3) P&F NONNUM'
*
* IF YEARS NOT GIVEN IN INCREASING ORDER, REVERSE THEM
*
(FOR) = (YR) [“THRU” | “-”] (YR)
        'IF (TRANS(1) <= TRANS(2)) GOTO X; SET TEMP = TRANS(1);
        SET TRANS(1) = TRANS(2); SET TRANS(2) = TEMP; X'
*
* IF YEAR SPECIFIED IS NON-NUMERIC OR IF YEAR OUT OF RANGE,
* PRINT THE ERROR MESSAGE BADYR
*
* IF VALID YEAR MATCHED AS N, REMOVE IT FROM TRANSLATION AND
* SUBSTITUTE YEAR-1960 WHICH IS A NUMBER BETWEEN 1 AND 13
*
(YR) = [N | 'P&F BADYR'] 'T&P VAL >= 1961 -AND- VAL <= 1973, BADYR;
       UNSTASH 1; STASH VAL-1960'

STATEMENT 1: “PLOT”: “P”: VERB: REQ MOD 2: SYNTAX (PLT):
             MAX TRANS 3: PROGRAM GPLOTR:
STATEMENT 2: “FOR”: “FP”: SYNTAX (FOR): MAX TRANS 2:


Information Management System:


The Off-The-Shelf System—A Packaged 


Information Management System 


By L. E. HEINDEL and J. T. ROBERTO 
(Manuscript received October 5, 1972) 


The Off-The-Shelf System (OTSS) is a packaged information management
system for hierarchical data bases. OTSS provides, without computer
programming, processes to enter and alter data in such a data base,
do complex retrievals of data from the data base, and specify various 
security mechanisms to limit access to, or alteration of, a data base. OTSS 
also provides a mechanism for extending the available processes on a 
project-by-project basis. OTSS has been implemented using MASTER 
LINKS and the NATURAL DIALOGUE SYSTEM. 


I. INTRODUCTION 


The Off-The-Shelf System (OTSS) is a packaged information 
management system for hierarchical data bases. Earlier work done by 
Sinowitz¹ was aimed at providing information retrieval capabilities
for a specific hierarchical data base. OTSS was designed to operate 
on any hierarchical data base regardless of its structure and regardless 
of the data fields stored in the data base. 

Retrieval processes are available to print, alter, rank, plot, dis- 
tribute, compute statistics, and perform regression analysis of data. 
These processes are specified to OTSS in a key-word English-like 
language in an interactive dialogue environment. OTSS allows for 
simple alteration of the retrieval process from request to request by 
selective replacement, deletion, or addition of statements to the 
dialogue description of the process to be performed. 

As part of the package, OTSS provides a data-base-independent 
security mechanism. This mechanism allows a data base administrator 
to restrict access or alteration of a data base and use of certain process
and language facilities in OTSS on a user-by-user basis. A user can be
restricted to a logical hierarchical subsection of the data base and to 
only certain data items stored in that subsection. The user can be 
further restricted to using only certain processes of OTSS such as 
printing, ranking, or altering processes, but not plotting or distributing. 

In addition, OTSS provides the ability to extend the processing 
capabilities of the system by allowing a programmer to add new 
processes to the system on a project-by-project basis. The processes 
so installed are available to the project installing them, and do not 
become a permanent part of OTSS. 

OTSS was implemented using the MASTER LINKS² data base
management system and the NATURAL DIALOGUE SYSTEM, a
system for designing and implementing interactive computing
languages in a dialogue environment.


II. SYSTEM DESIGN CRITERIA 


As a packaged information management system, OTSS was developed
to satisfy certain basic design criteria. The two primary
design criteria are hierarchical independence and field independence,
i.e., independence of the structure and of the specific content of the
data base. In establishing
these as design criteria the retrieval processors, security mechanisms, 
and associated language specifications are written to operate on any 
hierarchical data base regardless of its logical structure and regardless 
of the data types of the fields of the data base. The system provides 
these capabilities by making use of the information contained in the 
driving tables of the data base system, MASTER LINKS, used to 
implement OTSS. These driving tables describe the logical structure 
and data fields of a given data base. 

Another design criterion of OTSS was to provide the user with the 
ability to specify a generalized retrieval function in the sense of Ref. 2, 
and a comprehensive set of data base processors to operate on such 
functions. In general, a retrieval function is defined as a combination 
of data base fields, constants, and previously defined retrieval functions 
using the standard arithmetic, relational, and logical operators. 
Through a keyword-oriented language, a user of OTSS can specify 
an arbitrary retrieval function, and, for example, request the system 
to print its values, rank its values, or plot its values by the values of 
another retrieval function. The user can also specify a series of logical 
retrieval functions which are to be used to selectively delimit the search 
of the data base during the retrieval process. 




Directed output of any retrieval process is a fourth basic design 
criterion. Through the retrieval language, a user can direct the output 
of any retrieval process to any external device including line printer, 
card punch, magnetic tape, disk, or console. 

As a final design objective, OTSS offers the programming audience 
the ability to extend the processing capabilities of the system on a 
project-by-project basis. A programmer can write a project-dependent 
processor and “install” such a processor into the OTSS environment. 
Once installed, this processor is available only to the project installing 
it, and does not become a permanent part of OTSS. 

In the following sections we shall describe in more detail what we 
mean by a hierarchical data base, examine typical uses of retrieval 
functions, and present the definition of the load and retrieval phases 
of OTSS. 


III. HIERARCHICAL DATA BASES 


As discussed in greater detail in Ref. 2, a hierarchical data base is 
a directed tree which is rooted at one entity (node). At each entity in 
the tree is stored a set of fields. Two entities belong to the same group 
if the set of fields stored at one entity is identical to the set of fields 
stored at the other. Two entities belonging to the same group are at 
the same depth in the tree (i.e., connected to the root entity of the 
tree by paths of the same length). Also the ancestor groups of one 
entity belonging to a group must be identical to the ancestor groups 
of any other entity belonging to the group. 

Using the concept of groups, it is possible to represent the structure 
of a hierarchical data base as a rooted tree whose entities are the groups 
of the hierarchical data base placed in the tree analogously to the 
entities in the data base with respect to depth and connectivity. At 
each entity in the group tree are listed the fields stored at that group. 
In reference to the group tree, we say that a set of groups forms a 
chain if and only if for any $G_i$ and $G_j$ belonging to the set of groups, $G_i$ 
is an ancestor or descendant of $G_j$. A group chain is complete if all the 
ancestors of every group on the chain are on the chain. Entity chains 
and complete entity chains are defined in an analogous way. 
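
The chain definitions above can be illustrated with a minimal sketch. It assumes a group tree stored as a child-to-parent mapping; the group names follow Fig. 1, but the parent links (in particular the placement of WAREHOUSE under CITY) and the function names are assumptions made only for illustration.

```python
# Hypothetical group tree: each group maps to its father group (None for the root).
# Group names follow Fig. 1; the parent links are assumed for illustration.
PARENT = {
    "COMPANY": None,
    "STATE": "COMPANY",
    "CITY": "STATE",
    "STORE": "CITY",
    "DEPARTMENT": "STORE",
    "ITEM": "DEPARTMENT",
    "WAREHOUSE": "CITY",
    "WAREHOUSE ITEM": "WAREHOUSE",
}


def ancestors(group):
    """Return the ancestor groups of a group, nearest first."""
    result = []
    parent = PARENT[group]
    while parent is not None:
        result.append(parent)
        parent = PARENT[parent]
    return result


def is_chain(groups):
    """True if every pair of distinct groups is related as ancestor/descendant."""
    groups = list(set(groups))
    return all(
        a in ancestors(b) or b in ancestors(a)
        for i, a in enumerate(groups)
        for b in groups[i + 1:]
    )


def is_complete_chain(groups):
    """True if the groups form a chain that also contains all their ancestors."""
    members = set(groups)
    return is_chain(members) and all(set(ancestors(g)) <= members for g in members)
```

Under these assumed links, STORE and ITEM form a chain that is not complete, while ITEM and WAREHOUSE do not form a chain at all.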


3.1 Structure of a Sample Hierarchical Data Base 


As an example of the structure of a hierarchical data base, consider 
the group tree presented in Fig. 1. This hierarchical data base is 
rooted at the COMPANY group as there is only one company. We 
shall return to this sample data base in Section IX when we discuss 
examples of using the OTSS retrieval language. 


IV. SPECIFYING THE STRUCTURE OF A DATA BASE 


OTSS is independent of the hierarchical structure of the data base 
and the particular fields stored in the data base. OTSS obtains all 
its required information about the data base from a series of driving 
tables. These driving tables are produced using the BUILD facility 
of MASTER LINKS. 

BUILD allows the data base designer to completely specify the 
logical structure of the data base. The driving tables produced by 
BUILD are used by OTSS to check the correctness of data loading 
and retrieval requests, and by MASTER LINKS to enter and access 
data in the data base. 


V. LOADING DATA INTO A DATA BASE 


Once BUILD has been used by the data base designer to specify the 
logical structure of a hierarchical data base, it is ready to have data 
loaded into it. OTSS provides a facility, called LOAD, for bulk loading 
of data into the data base. 

LOAD allows files of data to be sequentially loaded into a data 
base, i.e., LOAD does not provide for multiple concurrent updates of 
a data base. LOAD does provide a simple mechanism to restart 
a data load which was terminated abnormally due to machine failure 
or human error. 

The file of data to be loaded using LOAD is logically divided into 
sections. Each section is identified by an integer number, called the 
card type, which must appear in columns 1 through 3 of each record 
of the section. More than one section in the data file may have the 
same card type. The data within a given card type section must be 
organized according to a specific card type definition, and it must be 
organized in the same manner for every section having the same card 
type. 

The various card types are defined once by the data base designer 
using the definition phase of LOAD. In a card-type definition, the 
designer indicates where, on the records of the specified section, the 
data values for particular fields can be found and the name of the 
entity where the data are to be loaded. Along with the field name, the 
designer indicates which record of the section and which field (set of 
contiguous columns) of that record contains the value of the field. 
Every section having the same card type must have the value for a 
specified field in the same field of the same record of the section, as 
described in the corresponding card-type definition. 


GROUP NAME        FIELD NAME               FIELD TYPE 
COMPANY           COMPANY NAME             STRING 
STATE             STATE NAME               STRING 
CITY              CITY NAME                STRING 
                  POPULATION               INTEGER 
STORE             STORE NAME               STRING 
                  ADDRESS                  STRING 
                  EARNINGS                 REAL 
                  DEPRECIATION             REAL 
                  STORES                   COUNTER 
DEPARTMENT        DEPARTMENT NUMBER        INTEGER 
                  SALES FORCE              INTEGER 
                  DOLLAR SALES             REAL 
                  DEPARTMENTS              COUNTER 
ITEM              ITEM NUMBER              INTEGER 
                  IN STOCK                 LOGICAL 
                  ON ORDER                 LOGICAL 
                  BACK ORDER               LOGICAL 
                  PURCHASE COST            REAL 
                  SELLING PRICE            REAL 
                  REORDER DATE             DATE 
                  TOTAL SALES              REAL 
                  ADVERTISING              REAL 
WAREHOUSE         WAREHOUSE ADDRESS        STRING 
WAREHOUSE ITEM    WAREHOUSE ITEM NUMBER    INTEGER 
                  AVAILABLE UNITS          INTEGER 


Fig. 1—Structure of a sample hierarchical data base. 


Thus the card-type definition specifies what data are contained in 
a section of a file of input data and where the data are to be loaded into 
the data base. When LOAD processes a file of data, it looks at columns 
1 through 3 of the records to determine what card-type definition 
should be used for the section and applies the appropriate card-type 
definition to direct its data loading process. 
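
The way a card-type definition directs loading can be sketched as follows. This is only an illustration under stated assumptions: the names `FieldSpec`, `CARD_TYPES`, and `load`, the card type 10, the column positions, and the `records_per_section` entry are all hypothetical and are not taken from LOAD itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    name: str     # data base field that receives the value
    record: int   # 1-based record number within the section
    start: int    # first column of the value (1-based, inclusive)
    end: int      # last column of the value (1-based, inclusive)


# Hypothetical card-type definitions, keyed by the integer in columns 1-3.
# "records_per_section" is an assumption so that a section's extent is known.
CARD_TYPES = {
    10: {
        "target": "STORE",
        "records_per_section": 1,
        "fields": [
            FieldSpec("STORE NAME", 1, 4, 23),
            FieldSpec("EARNINGS", 1, 24, 33),
        ],
    },
}


def load(records):
    """Split the input into sections by card type and extract field values."""
    loaded = []
    i = 0
    while i < len(records):
        card_type = int(records[i][0:3])          # columns 1 through 3
        definition = CARD_TYPES[card_type]
        section = records[i:i + definition["records_per_section"]]
        values = {
            spec.name: section[spec.record - 1][spec.start - 1:spec.end].strip()
            for spec in definition["fields"]
        }
        loaded.append((definition["target"], values))
        i += len(section)
    return loaded


sample = ["010" + "MAIN STREET STORE".ljust(20) + "   1250.50"]
loaded = load(sample)
```

Because every section of a given card type uses the same definition, the same column positions apply no matter where the section appears in the file.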


VI. RETRIEVALS FROM THE SAMPLE DATA BASE 


A simple form of a retrieval process on the sample data base is to 
extract the value of a single field at all its occurrences in the data base. 
For instance, one might wish to extract all the values of DOLLAR 
SALES in the data base. A more interesting case is to extract, along 
with the values of DOLLAR SALES, the corresponding values of 
DEPARTMENT NUMBER. If the department numbers were only 
unique within a store, one might wish to also extract the corresponding 
values of STORE NAME. Notice that due to the structure of our 
sample hierarchical data base, there is only one value of STORE 
NAME for each value of DEPARTMENT NUMBER and DOLLAR 
SALES, but there are several values of DOLLAR SALES and DE- 
PARTMENT NUMBER for each value of STORE NAME. 

Suppose now that one were interested only in extracting the value 
of DOLLAR SALES for a given department within a given store. This 
is similar to the first example of a retrieval process, except that the 
retrieval process would first delimit the search of the data base to a 
subtree of the data base consisting of the particular STORE entity 
and DEPARTMENT entity. To do this there is a directory into the 
data base which is based on the name of an entity or a chain of entity 
names. Having delimited the search of the data base to the particular 
store and department, the retrieval process can extract the one value 
of DOLLAR SALES contained in the delimited data base. 

Some retrieval processes combine extraction based on known entity 
names and the values of fields stored in the data base. One might wish 
to extract the value of ITEM NUMBER for those items in one particu- 
lar store which have SELLING PRICE greater than $9.00. In this 
case, the retrieval process would first delimit the search of the data 
base to the entity in the STORE group with the appropriate name 
and to all entities which are descendants of it. The retrieval process 
would then search through all entities in the ITEM group in the 
delimited part of the data base and extract the ITEM NUMBER for 
those items having SELLING PRICE greater than $9.00. 

So far only examples of extracting the values of simple fields stored 
in the data base have been discussed. It is also possible to evaluate 
more complex retrieval functions as part of the retrieval process. A 
simple example would be to extract the value of SELLING PRICE/ 
PURCHASE COST. This retrieval process creates a new pseudo-field 
at the ITEM group which is then extracted. A more complex example 
is the summing of all the values of DOLLAR SALES within a store. 
To evaluate this function, the retrieval process would have to extract 
the value of DOLLAR SALES for every DEPARTMENT entity 
under each STORE entity and then add them together. This process 
produces a new pseudo-field at the STORE group which is then 
extracted. An operation which raises the level, in the hierarchy, of 
definition of a field or expression is called a level-raising operation. 
Summing is not the only level-raising operation which can be performed. 
Others are minimum, maximum, and average on numerical 
data and any, all, and none for logical data. For instance, one might 
wish to extract the DEPARTMENT NAME of all departments that 
have their minimum ratio of SELLING PRICE to PURCHASE COST 
less than 1. One might also wish to extract the DEPARTMENT 
NAMES as above, but only including in the level-raising operation 
items which have a selling price greater than $9.00. 

We have now seen several examples of retrieval processes. All these 
retrieval processes are examples of a complete retrieval process which 
delimits the search of a data base by entity names, accepts or rejects 
entities by logical conditions, and evaluates complex retrieval functions 
including level-raising. We have only referred to extracting data from 
a data base and have not said anything about what should be done 
with the data once extracted. This was done intentionally to divorce 
the retrieval process from the displaying of the extracted data. 

The retrieval language provides processes to print, alter, rank, plot, 
distribute, compute statistics, and perform regression analysis of 
extracted retrieval functions by using the appropriate keyword. 


VII. SPECIFICATION OF RETRIEVAL FUNCTIONS 


Retrieval functions are specified by combining data base fields, 
constants, and previously defined retrieval functions using the standard 
arithmetic, relational, and logical operators. The arithmetic operators 
defined on numeric data and their symbols are: addition (+), subtraction 
(-), multiplication (*), division (/), and exponentiation (↑). 
The relational binary operators defined for numeric data and date 
data and their symbols are: equal to (=), not equal to (¬=), greater 
than (>), greater than or equal to (>=), less than (<), and less 
than or equal to (<=). Relational binary operators defined for string 
data and their symbols are: equal to (=) and not equal to (¬=). 
Logical binary operators defined for logical data and their symbols 
are: logical and (AND) and logical or (OR). 

The unary operators available in the OTSS retrieval language are 
of two types: those which operate on a single value and those which 
operate on a set of values. Unary operators operating on a single value 
are: unary plus (+), unary minus (-), logarithm to the base 10 
(LOG10), logarithm to the base e (LOGE), e raised to a power 
(EXPF), absolute value (ABSF), sine (SINF), and cosine (COSF) 
for numeric data and not (NOT) for logical data. 

Unary operators which work on a set of values are the level-raising 
operators. Level-raising operators are of the following two forms: 


lr field PER gn 


or 


GLOBAL lr field PER gn, 


where lr is any level-raising operator and gn is any group name. The 
field must be defined at a group which is a descendant of the group 
following the PER. The set of values that the level-raising operator 
operates on are the values of the field in entities which are descendants 
of an entity in the group following PER. The level-raising operator 
takes the set of values and computes a single value which depends 
on the level-raising operator. The level-raising operators for numeric 
values are: sum (SUM), minimum (MIN), maximum (MAX), and 
average (AVG). The level-raising operators for logical values are: 
logical any (ANY), logical all (ALL), and logical none (NO). 

If the level-raising operator is not preceded by the literal GLOBAL, 
any entity restrictions that have been applied to the groups from 
the group following PER down to and including the group of the 
field are evaluated, and every entity for which a restriction is not 
satisfied is rejected together with its descendants. An entity restriction is a 
retrieval function whose type is logical. An entity in the group of the 
retrieval function is accepted or rejected if the logical function evalu- 
ates to TRUE or FALSE respectively. The remaining set of entities 
at the group of the field are then combined using the level-raising 
operator. If the literal GLOBAL is present, all entity restrictions 
below the group following PER are ignored. 
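
The effect of the GLOBAL prefix can be seen in a small sketch. It is a simplification that handles a single level (one STORE entity and its DEPARTMENT descendants) and a single entity restriction; the data, the `LEVEL_RAISERS` table, and the function `level_raise` are hypothetical names chosen for illustration.

```python
# Hypothetical data: the DEPARTMENT descendants of one STORE entity.
departments = [
    {"DEPARTMENT NUMBER": 1, "DOLLAR SALES": 3000.0},
    {"DEPARTMENT NUMBER": 2, "DOLLAR SALES": 7000.0},
    {"DEPARTMENT NUMBER": 3, "DOLLAR SALES": 1000.0},
]

# The level-raising operators named in the text, mapped to Python callables.
LEVEL_RAISERS = {
    "SUM": sum,
    "MIN": min,
    "MAX": max,
    "AVG": lambda values: sum(values) / len(values),
    "ANY": any,
    "ALL": all,
    "NO": lambda values: not any(values),
}


def level_raise(op, field, entities, restriction=None, global_=False):
    """Combine the field values of descendant entities into a single value.

    Without GLOBAL, entities that fail the entity restriction are rejected
    first; with GLOBAL, the restriction is ignored.
    """
    if restriction is not None and not global_:
        entities = [e for e in entities if restriction(e)]
    return LEVEL_RAISERS[op]([e[field] for e in entities])


def under_5000(entity):
    """Entity restriction: DOLLAR SALES < 5000."""
    return entity["DOLLAR SALES"] < 5000


restricted = level_raise("SUM", "DOLLAR SALES", departments, under_5000)
unrestricted = level_raise("SUM", "DOLLAR SALES", departments, under_5000, global_=True)
ratio = restricted / unrestricted
```

Here the restricted sum covers only departments 1 and 3, while the GLOBAL sum covers all three, so the ratio expresses the share of store sales contributed by the departments that satisfy the restriction. Note that MIN, MAX, and AVG in this sketch fail on an empty set of values; how OTSS treats that case is not stated in the text.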

Unary operators can be nested without parentheses and are evalu- 
ated from right to left. Parentheses can be used to cause evaluation 
of a retrieval function to occur in other than the normal order of 
evaluation. 

The groups of all fields in a retrieval function must form a group 
chain. A retrieval function has a definition group associated with it. 
The definition group of a retrieval function is the group of maximum 
depth of the groups of fields not operated upon by a level-raising 
operator and the groups given in level-raising operators. In this manner, 
retrieval functions are made to be single-valued for each entity in the 
definition group. 


VIII. THE RETRIEVAL PROCESS 


The retrieval process used by OTSS can be described by considering 
the steps involved to evaluate any one retrieval function for all 
entities at which it is defined in a subset of the data base. To describe 
the retrieval process, let us assume we wish to evaluate a retrieval 
function, f, defined at some group, G, at every entity of G in a subset 
of the data base. The steps of the retrieval process are as follows: 


Step 1: Entity Selection Based on Entity Chains 

(a) Delimit the search of the data base by constructing an 
    access tree based on any specified entity chains. 


Step 2: Entity Selection Based on Entity Restrictions 

(a) Start at the root of the access tree constructed in Step 1. 

(b) Select the next entity in a depth-first, left-to-right manner. 
    If all entities have been selected, the retrieval process is 
    finished. 

(c) If an entity restriction has been placed on entities in the 
    group of the entity obtained by Step 2b, apply it. If the 
    result is “reject,” reject the entity and all its descendants 
    and go back to Step 2b. If f is defined at the group of the 
    entity, evaluate it using Step 3. Upon completion of the 
    evaluation or if f is not defined at the entity, go to Step 2b. 


Step 3: Function Evaluation 

(a) If f is a field, retrieve its value; or if f is a constant, use its 
    value. 

(b) If f is a level-raised field without the GLOBAL prefix, 
    apply all entity restrictions to entities in all groups from G 
    down to and including the group of the level-raised field 
    and ignore all entities and their descendants for which the 
    entity restriction is not satisfied. Perform Step 3 on each 
    remaining entity of the group of the field and combine the 
    results according to the appropriate level-raising operator. 

(c) If f is a level-raised field with the GLOBAL prefix, perform 
    Step 3 on all descendant entities at the group of the field 
    and combine the results according to the appropriate 
    level-raising operator. 

(d) Combine values obtained in Steps 3a, 3b, and 3c using 
    appropriate operators. 


The above described retrieval process can be expanded to evaluate 
several different retrieval functions during one pass through the data 
base. It can be seen that all the retrieval processes in Section VI can 
be formulated in terms of the general retrieval process given above. 
Hence the user of OTSS need only learn how to specify the retrieval 
process and the form of output desired. Detailed algorithms for 
implementing the retrieval process are described in Appendix A. 
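
As an informal companion to Steps 2 and 3, the following sketch walks an access tree depth-first and left to right. It assumes that Step 1 has already produced the access tree, represents entities as nested dictionaries, and treats f as a plain Python function; these representations and the name `retrieve` are assumptions for illustration, not the OTSS implementation.

```python
def retrieve(entity, restrictions, group, f):
    """Depth-first, left-to-right traversal in the spirit of Steps 2 and 3.

    restrictions maps a group name to a predicate on an entity;
    f is evaluated at every accepted entity of the given group.
    """
    results = []
    check = restrictions.get(entity["group"])
    if check is not None and not check(entity):
        return results                      # reject the entity and its descendants
    if entity["group"] == group:
        results.append(f(entity))           # Step 3: evaluate f at this entity
    for child in entity["children"]:        # children are visited left to right
        results.extend(retrieve(child, restrictions, group, f))
    return results


# Hypothetical access tree: one STORE entity with three DEPARTMENT entities.
store = {
    "group": "STORE",
    "fields": {"STORE NAME": "19 FIFTH AVE"},
    "children": [
        {"group": "DEPARTMENT", "children": [],
         "fields": {"DEPARTMENT NUMBER": number, "DOLLAR SALES": sales}}
        for number, sales in [(1, 3000.0), (2, 7000.0), (3, 1000.0)]
    ],
}
restrictions = {"DEPARTMENT": lambda e: e["fields"]["DOLLAR SALES"] > 2000}
values = retrieve(store, restrictions, "DEPARTMENT",
                  lambda e: e["fields"]["DEPARTMENT NUMBER"])
print(values)  # prints [1, 2]
```

Rejecting an entity returns before its children are visited, which is how a failed restriction removes a whole subtree from the search.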


IX. LANGUAGE EXAMPLES 


Having presented the descriptions of retrieval functions and the 
retrieval process, let us proceed to examine several examples of the 
OTSS retrieval language. The OTSS retrieval language is a keyword- 
oriented, English-like language which provides the necessary input 
to the retrieval process and means of specifying the output format. The 
language contains FOR and IN statements which are used to specify 
subtree delimiting of the search of the data base; WHEN statements 
for specifying logical entity restrictions; LET statements for specifying 
retrieval functions; and various output specification statements such 
as PRINT, RANK, PLOT, etc. A complete description of these and 
other auxiliary statements is given in Appendix B. 

To begin our examples, let us print the names of all the departments 
in the store at 19 Fifth Ave., New York City, New York. The following 
statements accomplish this: 


PRINT DEPARTMENT NAME: FOR 19 FIFTH AVE, NEW 
YORK CITY, NEW YORK: GO: 


Now to print only those departments with dollar sales greater than 
$5000 we need only enter the following statements: 


WHEN DEPARTMENT HAS DOLLAR SALES > 5000: GO: 


as OTSS remembers the last occurrence of each keyword statement 
entered. 

To print the department which has the highest dollar sales in the 
store at 19 Fifth Ave., we would enter the statements: 


WHEN DEPARTMENT HAS DOLLAR SALES = GLOBAL 
MAX DOLLAR SALES PER STORE: GO: 


Suppose we now wished to print the ratio of dollar sales of those 
departments having dollar sales less than $5000 to the dollar sales of 
the entire store. We would enter the statements: 


PRINT DOLLAR SALES PER STORE/GLOBAL DOLLAR 
SALES PER STORE: WHEN DEPARTMENT HAS DOLLAR 
SALES < 5000: GO: 


If we wished to do the above request for all stores, not just the one 
at 19 Fifth Ave., we would enter: 


DELETE FOR: GO: 


And suppose finally we wanted to do the same request for only those 
stores whose dollar sales are greater than $1,000,000 but less than 
$5,000,000. We would enter: 


WHEN STORE HAS GLOBAL DOLLAR SALES PER STORE 
> 1000000 AND GLOBAL DOLLAR SALES PER STORE 
< 5000000: GO: 


To save a small amount of typing, we could have entered the following: 


LET X = GLOBAL DOLLAR SALES PER STORE: WHEN 
X > 1000000 AND X < 5000000: GO: 
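

The examples rely on OTSS remembering the last occurrence of each keyword statement. A rough sketch of that behavior follows. It is hypothetical throughout: the class `Session` and its parsing rules are assumptions, and it keys statements on the keyword alone, so it does not model WHEN statements for different groups coexisting or several LET statements being active at once.

```python
class Session:
    """Keeps the most recent statement for each keyword between requests."""

    def __init__(self):
        self.statements = {}

    def enter(self, line):
        """Accept colon-terminated statements; return the active set on GO."""
        active = None
        for text in (part.strip() for part in line.split(":")):
            if not text:
                continue
            keyword, _, body = text.partition(" ")
            if keyword == "GO":
                active = dict(self.statements)      # snapshot used for this request
            elif keyword == "DELETE":
                self.statements.pop(body.strip(), None)
            else:
                self.statements[keyword] = body.strip()
        return active


session = Session()
first = session.enter(
    "PRINT DEPARTMENT NAME: FOR 19 FIFTH AVE, NEW YORK CITY, NEW YORK: GO:")
second = session.enter("WHEN DEPARTMENT HAS DOLLAR SALES > 5000: GO:")
third = session.enter("DELETE FOR: GO:")
```

The second request inherits the PRINT and FOR statements from the first, and the third runs without the FOR delimiter once it has been deleted.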


X. REPORT FACILITY 


The REPORT statement is the general interface to extend the 
processes available in the OTSS retrieval language on a project-by- 
project basis. One can write a process in FORTRAN which can be 
installed into OTSS to be referenced by some report name using the 
REPORT statement. The syntax of the REPORT statement is the 
keyword “REPORT” followed by a report name, optionally followed 
by a list of retrieval functions separated by commas. If the process 
requires retrieval functions to be passed into it as parameters, they 
follow the report name in a manner analogous to the retrieval function 
list in the PRINT statement. The process so installed is available to 
the project installing it, and does not become a permanent part of 
OTSS. 


XI. SYSTEM SECURITY 


The SECURITY statement is a command in the language which 
enables a data base administrator to define, interrogate, and remove 
security information for his user audience. The types of security which 
are available to a data base administrator are environmental, language, 
and data base. These security mechanisms may be used by the admini- 
strator to restrict access or alteration of his data base and use of 
facilities in the retrieval language on a user-by-user basis. 


11.1 Environmental Security 


The first type of security which may be specified is the environmental 
security. Environmental security is used by the administrator to 
specify legal users of the system. This security mechanism uses two 
pieces of information: a sign-on key passed into the system and 
password information input by the user. The data base administrator 
specifies valid combinations of sign-on keys and optional password 
information for each potential user of the system. When a user initially 
enters the retrieval environment, the system will interrogate the sign-on 
key typed in by the user to see if it is valid. If the sign-on key is not 
in the list of valid sign-on keys provided for by the administrator, the 
system will terminate the session. A valid sign-on key will cause the 
system to prompt the user for the password information (if this 
sign-on key requires a password). If the password is incorrect, the 
session is terminated. 
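
The sign-on sequence just described amounts to a two-stage check, sketched below. The table `VALID_SIGN_ONS`, its contents, and the function `admit` are hypothetical; a production system would also store passwords in hashed form, which this sketch omits for brevity.

```python
# Hypothetical table: sign-on key -> required password, or None if none is needed.
VALID_SIGN_ONS = {"PROJ01": "swordfish", "GUEST": None}


def admit(sign_on_key, ask_password):
    """Return True if the session may continue, False if it is terminated."""
    if sign_on_key not in VALID_SIGN_ONS:
        return False                          # unknown key: terminate the session
    required = VALID_SIGN_ONS[sign_on_key]
    if required is None:
        return True                           # this key needs no password
    return ask_password() == required         # wrong password: terminate
```

The password prompt is passed in as a function so that it is issued only after the sign-on key has been found valid, as in the text.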


11.2 Language Security 


The data base administrator has the ability to limit use of certain 
facilities in the retrieval language. This type of security is called 
language security. In order to specify language security the admini- 
strator defines one or more statement restriction classes. A statement 
restriction class is a set of statements in the retrieval language which 
is not available to a set of users of a data base. The administrator would 
then indicate, on a user-by-user basis, which statement restriction 
class pertains to each user. If a user attempts to use a statement in the 
retrieval language which is a member of his statement restriction class, 
the system will output a message indicating that the statement in 
question is not available for use by him. 


11.3 Data Base Security 


The final type of security provided for by the system is data base 
security. Data base security can be subdivided into two parts: field 
security and access tree security. Field security is specified in a manner 
similar to the language security specification. The data base admini- 
strator defines one or more field restriction classes. A field restriction 
class is a set of fields in the data base which is not accessible to one or 
more users of the system. A field restriction class may be restricted 
on a read/write basis or on a write basis only. After defining the sets 
of field restriction classes, the administrator indicates, on a user-by- 
user basis, which restriction class pertains to each user. If a user 
attempts to retrieve or modify the value of a field which is a member 
of his field restriction class, the system will output a message indicating 
that the field in question is not available for access or alteration. 
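
Field security can be pictured as a lookup from user to restriction class to restricted fields. The sketch below assumes each class carries one mode, either “read/write” (the fields can be neither retrieved nor altered) or “write” (the fields can be retrieved but not altered); the class names, user names, and the function `may_access` are hypothetical.

```python
# Hypothetical field restriction classes and user assignments.
FIELD_RESTRICTION_CLASSES = {
    "NO-FINANCE": {"mode": "read/write", "fields": {"EARNINGS", "DEPRECIATION"}},
    "FIXED-PRICES": {"mode": "write", "fields": {"SELLING PRICE"}},
}
USER_CLASS = {"jones": "NO-FINANCE", "smith": "FIXED-PRICES"}


def may_access(user, field, action):
    """Return True if the user may perform action ("read" or "write") on field."""
    restriction = FIELD_RESTRICTION_CLASSES.get(USER_CLASS.get(user))
    if restriction is None or field not in restriction["fields"]:
        return True                           # field is not restricted for this user
    # A "write" restriction still permits reading; "read/write" permits neither.
    return restriction["mode"] == "write" and action == "read"
```

Language security follows the same pattern, with statement names in place of field names.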

Access tree security is used to restrict users to a logical hierarchical 
subsection of a data base. For each potential user of his system, the 
data base administrator may specify a corresponding USER statement. 
The USER statement has the same form as the FOR and IN state- 
ments described in Appendix B, and is used to delimit the search of 
the data base. 

When a user initially enters the retrieval environment, after passing 
the environmental security phase, the USER statement corresponding 
to that user will be processed to delimit the search of the data base. 
If the user tries to access a portion of the data base outside of the 
logical hierarchical subsection specified in the USER statement, the 
system will output a message indicating this as an illegal action. Note 
that the USER statement cannot be modified or deleted by a user. 


XII. CONCLUSION 


OTSS is an information management system designed to be in- 
dependent of the structure or content of any specific hierarchical data 
base. OTSS provides a simple keyword-oriented, English-like language 
for specifying the retrieval of values of complex retrieval functions 
and the alteration of data in a data base. In addition, OTSS provides 
a means of loading data into a data base and specifying various forms 
of security on the data base and the use of statements in the language. 


APPENDIX A 


The retrieval process (RP) has as its inputs a set of complete access 
lists, a set of retrieval functions, and a set of entity restriction functions. 
These inputs completely specify the semantics of the retrieval process. 

RP applies the algorithm TB (Tree Building) to construct a subtree 
of the data base over which the retrieval process will be performed. 
RP applies the algorithm GTP (Group Tree Pruning) to make up a 
list, R, whose entry for each group, g, is “referenced” if g is the defini- 
tion group of a retrieval function or an ancestor of the definition group 
of a retrieval function. 

RP uses the algorithm ACTION to create a list, A, of selector 
directions for each group in the hierarchy. The entry in A for group, g, 
is “down” if it is an ancestor of the definition group of a retrieval 
function and is “right” otherwise. 

After having applied TB, GTP, and ACTION, RP proceeds to 
select the entities of the subtree (using the algorithm GEN) in a 
left-to-right, depth-first manner. The algorithm GEN returns to RP only 
those entities at which a retrieval function or an entity restriction 
function may need to be evaluated and whose group is the group of a 
retrieval function or an ancestor of the group of a retrieval function. 
In this manner, no extraneous entities are selected. 

Whenever an entity is returned to RP by GEN, RP determines if 
there is an entity restriction function to be evaluated at this entity. 
If there is, it is evaluated. If the entity restriction function evaluates 
to false, the action input to GEN is set to “right” and GEN is applied 
to select the next entity. 

Should there be no entity restriction function to be evaluated, or 
should it evaluate to true, RP examines the list of retrieval functions 
to be evaluated and evaluates those defined at the group of the current 
entity. RP then iterates the whole procedure until all the entities on 
the subtree have been generated. 

More formally the following algorithms define the functions of RP, 
TB, GTP, ACTION, and GEN. 


Retrieval Process (RP) 


Input:  Complete entity access lists: $e_1, e_2, \ldots, e_n$. 
        Retrieval functions: $f_1^{g_1}, f_2^{g_2}, \ldots, f_k^{g_k}$. 
        Entity restriction functions: $b_1^{g_1}, b_2^{g_2}, \ldots, b_m^{g_m}$. 
Output: Values of retrieval functions: $f_1^{g_1}, f_2^{g_2}, \ldots, f_k^{g_k}$. 


Step 1: $T \leftarrow \mathrm{TB}(e_1, e_2, \ldots, e_n)$, which builds a tree, $T$, the subtree 
        of the data base that the retrieval process is to be applied to. 

Step 2: $R \leftarrow \mathrm{GTP}(f_1^{g_1}, f_2^{g_2}, \ldots, f_k^{g_k})$, which builds a list, $R$, containing 
        one entry for each group in the hierarchy. The entry for a 
        group is marked “referenced” if the group is one of the $g_i$ or 
        an ancestor of one of the $g_i$ and is marked “unreferenced” otherwise. 

Step 3: $A \leftarrow \mathrm{ACTION}(f_1^{g_1}, f_2^{g_2}, \ldots, f_k^{g_k})$, which builds a list, $A$, 
        containing one entry for each group in the hierarchy. The entry 
        for a group is marked “down” if the tree traversal action to 
        be performed is “down” and is “right” otherwise. 

Step 4: CE $\leftarrow$ root entity of the retrieval tree, $T$. 

Step 5: Examine the list of entity restriction functions to see if any 
        $b_i^{g_i}$ is defined at the group of the current entity, CE. If none, 
        go to Step 7. 

Step 6: Evaluate $b_i^{g_i}$. If the value is FALSE, go to Step 11. 

Step 7: Any more $f_i^{g_i}$ to be evaluated? If none, go to Step 9. 

Step 8: Evaluate $f_i^{g_i}$, output result and go to Step 7. 

Step 9: If there are no more entities on $T$ to be generated, then exit. 

Step 10: CE $\leftarrow \mathrm{GEN}(T, R, A(g_i), \mathrm{CE})$, which generates the next 
         entity on $T$, then go to Step 5. 

Step 11: If there are no more entities on $T$ to be generated, then exit. 

Step 12: CE $\leftarrow \mathrm{GEN}(T, R, \text{“right,”}\ \mathrm{CE})$, then go to Step 5. 


Tree Building (TB) 


Input:  Complete entity access lists: $e_1, e_2, \ldots, e_n$. 

Output: The retrieval tree, $T_2$. 


Step 1: $T_1 \leftarrow$ null tree. 

Step 2: If there are no more $e_i$, go to Step 5. 

Step 3: Construct tree, $T_3$, consisting of the entities on the access 
        list, $e_i$. 

Step 4: $T_1 \leftarrow T_1$ union $T_3$, go to Step 2. 

Step 5: Make $T_2$ a copy of $T_1$. 

Step 6: If there are no more unexamined entities on $T_2$, exit. 

Step 7: $e \leftarrow$ next unexamined entity on $T_2$. 

Step 8: Examine each group which is a descendant of the group of $e$ 
        to see if there exists an entity on $T_2$ which is a descendant 
        of $e$. For each group in which this is not true, put all entities 
        in the data base on $T_2$ which are descendants of $e$. Go to 
        Step 6. 


Group Tree Pruning (GTP) 


Input:  Retrieval functions: $f_1^{g_1}, f_2^{g_2}, \ldots, f_k^{g_k}$. 

Output: A list, $R$, containing one entry for each group in the 
        hierarchy. The entry for a group is marked “referenced” if the 
        group is one of the $g_i$ or an ancestor of one of the $g_i$ and is 
        marked “unreferenced” otherwise. 


Step 1: Initialize the entries in $R$ for each group in the hierarchy to 
        “unreferenced.” 

Step 2: If there are no more $f_i^{g_i}$, exit. 

Step 3: $g \leftarrow g_i$. 

Step 4: If the entry in $R$ for group $g$ is “referenced,” go to Step 2. 

Step 5: Set the entry in $R$ for group $g$ to “referenced.” 

Step 6: If $g$ is the root group, go to Step 2. 

Step 7: $g \leftarrow$ father of $g$; go to Step 4. 


Generating Action (ACTION) 


Input:  Retrieval functions: $f_1^{g_1}, f_2^{g_2}, \ldots, f_k^{g_k}$. 

Output: A list, $A$, containing one entry for each group in the hierarchy. 
        The entry for a group is marked “down” if the tree traversal 
        is down, and “right” otherwise. 


Step 1: Initialize the entries in $A$ for each group in the hierarchy to 
        “right.” 

Step 2: If there are no more $f_i^{g_i}$, then exit. 

Step 3: $g \leftarrow g_i$. 

Step 4: If the entry in $A$ for group $g$ is “down,” go to Step 2. 

Step 5: If $g$ is the root group, go to Step 2. 

Step 6: $g \leftarrow$ father of $g$. 

Step 7: If the entry in $A$ for group $g$ is “down,” go to Step 2. 

Step 8: Set entry in $A$ for group $g$ to “down,” go to Step 5. 
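

The GTP and ACTION steps can be transcribed into a short sketch. It assumes the group hierarchy is given as a mapping from each group to its father group (None for the root) and that the definition groups of the retrieval functions have already been determined; the function names `gtp` and `action` and the linear sample hierarchy are assumptions made for illustration.

```python
def gtp(definition_groups, group_parent):
    """Mark each definition group and all of its ancestors as referenced."""
    referenced = {g: False for g in group_parent}
    for g in definition_groups:
        # Climb toward the root, stopping at a group that is already marked.
        while g is not None and not referenced[g]:
            referenced[g] = True
            g = group_parent[g]
    return referenced


def action(definition_groups, group_parent):
    """Mark proper ancestors of definition groups "down"; all others "right"."""
    direction = {g: "right" for g in group_parent}
    for g in definition_groups:
        if direction[g] == "down":
            continue        # the ancestors of g were marked for another function
        g = group_parent[g]
        while g is not None and direction[g] != "down":
            direction[g] = "down"
            g = group_parent[g]
    return direction


# Hypothetical linear hierarchy using group names from Fig. 1.
GROUP_PARENT = {
    "COMPANY": None, "STATE": "COMPANY", "CITY": "STATE",
    "STORE": "CITY", "DEPARTMENT": "STORE", "ITEM": "DEPARTMENT",
}
R = gtp(["STORE", "DEPARTMENT"], GROUP_PARENT)
A = action(["STORE", "DEPARTMENT"], GROUP_PARENT)
```

With retrieval functions defined at STORE and DEPARTMENT, every group except ITEM is referenced, and the traversal action is “down” for COMPANY through STORE and “right” for DEPARTMENT and ITEM.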


Tree Generation (GEN) 


Input:  Retrieval tree, $T$; the list $R$ of referenced and unreferenced 
        groups; the action, $A$, either “right” or “down”; and the 
        current entity, CE. 

Output: CE, the next entity to be processed by the retrieval process. 


Step 1: If the action, $A$, is “right,” go to Step 4. 

Step 2: Find the leftmost entity on $T$, LME, which is a descendant 
        of CE and for which the entry for the group of CE in list $R$ 
        is marked “referenced.” If there are none, go to Step 4. 

Step 3: CE $\leftarrow$ LME, then exit. 

Step 4: If CE has no brother entity to the right, go to Step 6. 

Step 5: CE $\leftarrow$ next brother of CE to the right, then exit. 

Step 6: Find the leftmost group, $g$, which is on the same level as the 
        group of CE for which the entry in the list $R$ is marked 
        “referenced” and for which there exists an entity on $T$ which 
        has not previously been processed. If none exists, go to 
        Step 8. 

Step 7: CE $\leftarrow$ leftmost entity of group, $g$, which has not yet been 
        generated; then exit. 

Step 8: CE $\leftarrow$ father of CE; go to Step 4. 


APPENDIX B 


OTSS Retrieval Language Statements 


To make the description of the OTSS retrieval language statements
more readable, the following notations are used:

(a) Capitals and special symbols are literals in the language.

(b) Lowercase names include:
      f     any retrieval function
      str   any non-null string of alphanumeric characters
      num   any number
      gn    the name of a group
      null  a null character string
      stmt  the name of a statement in the retrieval language.

(c) Square brackets imply that the constructs within the brackets 
are alternatives starting from the top line down. One item from 
the vertical list of alternatives must be selected. 


LET str =f: 


The LET statement is used to create additional fields which are not 
stored in the data base. The field so created may be used in any other 
statement or retrieval function. 


IN e-list_1; e-list_2; ... e-list_n:


The notation, e-list, indicates a list of entity names separated by 
commas. Each e-list represents an entity chain. The combination of 
all entity chains specifies an access tree. This access tree is used to 
select the subset of the data base over which the retrieval search will 
take place. 


WHEN [ FOR gn ] f:
     [ null   ]


The WHEN statement specifies an entity restriction function 
defined at the group, gn, which delimits the search of the data base 
during the retrieval process. If a WHEN condition is defined for a 
group, retrieval will take place from an entity within that group only if
the entity restriction function evaluates to TRUE. If the entity 
restriction evaluates to FALSE, the entity (and all its descendents) 
will be ignored during the retrieval process. 


PRINT f_1, f_2, ... f_n:


The PRINT statement specifies a tabular printout of the values of
the individual retrieval functions.


DISTRIBUTE f_1 BY f_2:


The DISTRIBUTE statement specifies a tabular histogram with f_1
as the ordinate and f_2 as the abscissa.


TO IN STEPS OF num; 
BETWEEN num, | ia num: | ISO num; : 
null 


The BETWEEN statement specifies the range and cell intervals to 
be used by the DISTRIBUTE process. 


CUMULATIVELY: 


The CUMULATIVELY statement may be used with the DIS- 
TRIBUTE process to alter the distribution to produce cumulative 
values in each of the defined cells. 


CHART: 


The CHART statement may be used with the DISTRIBUTE 
process to specify bar chart output. 


RANK f AT gn: 


The RANK statement ranks, in descending order (largest to smallest),
the individual values of f within each entity of the group gn and
displays the results in tabular form.


INVERSELY: 


The INVERSELY statement specifies to the RANK process to 
invert the order of the RANK output. 


                    [ LARGEST  ]
KEEPING [ THE  ]    [ HIGHEST  ] num:
        [ null ]    [ SMALLEST ]
                    [ LOWEST   ]


The KEEPING statement is used to specify to the RANK process 
to rank the ‘‘num”’ largest or smallest values of the rank function at 
each entity in the rank group. 


CARRYING ALONG f_1, f_2, ... f_n:


The CARRYING statement may be used to specify to the RANK 
process to “carry along” the values of other retrieval functions and 
have them displayed as part of the RANK output. 
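
The interplay of RANK, KEEPING, and INVERSELY can be illustrated with a small Python sketch. It is not OTSS code: the function name, its arguments, and the sample data are assumptions, and the sketch further assumes that INVERSELY simply reverses the order of whatever values KEEPING has retained.

```python
def rank_keeping(values_by_entity, num=None, smallest=False, inversely=False):
    """Rank values within each entity, largest first, as RANK is described."""
    result = {}
    for entity, values in values_by_entity.items():
        ordered = sorted(values, reverse=True)        # RANK: largest to smallest
        if num is not None:                           # KEEPING (assumes num >= 1)
            ordered = ordered[-num:] if smallest else ordered[:num]
        if inversely:                                 # INVERSELY: invert the order
            ordered = ordered[::-1]
        result[entity] = ordered
    return result


# Hypothetical values of a rank function f within two entities of a group.
sample = {"dept_a": [3, 9, 1, 7], "dept_b": [4, 2]}

two_largest = rank_keeping(sample, num=2)
two_smallest = rank_keeping(sample, num=2, smallest=True)
ascending = rank_keeping(sample, inversely=True)
```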


PLOT f_1, f_2, ..., f_{n-1} BY f_n:


The PLOT statement specifies an X-Y point plot with the values of
f_1 through f_{n-1} as the ordinate and f_n as the abscissa.


ee arom um ie num: 
Y-AXIS area Oe 0 faa 
null 

The X-AXIS and Y-AXIS statements must be used to specify to
the PLOT process the origins and ranges of the independent
and dependent fields.


STATISTICS f_1, f_2, ... f_n:


The STATISTICS statement requests a set of standard statistics
to be produced for each function and the results printed in tabular
form.

REGRESS f_1 BY f_2, f_3, ... f_n:


The REGRESS statement specifies to perform a multiple linear
regression analysis of the function f_1 (dependent field) by the functions
f_2 through f_n (independent fields).


ALTER field TO f: 


The ALTER statement is used to permanently change the value of 
the field to the value of f. 

            [ BRIEF  ]
INTERACTION [ DETAIL ]
            [ VERIFY ]


The INTERACTION statement specifies to the ALTER process a 
level of verification required by the user when altering a datum value. 





REPORT str [ f_1, f_2, ... f_n ]
           [ null              ]


The REPORT statement specifies to pass control to an application 
dependent process identified by the string, str. 


TITLE str_1 ! str_2 ! ... ! str_n:


The TITLE statement specifies to any process to print lines of text 
centered at the top of the output of the process. 


PLACES num: 


The PLACES statement is used to control the number of decimal 
places displayed for real-valued functions. 


OUTPUT TO str:


The OUTPUT statement specifies to any process to direct its output
to a specific output device.


       [ WHEN FOR gn_1, gn_2, ... gn_n ]
DELETE [ stmt                          ]
       [ ALL                           ]


The DELETE statement specifies to remove a statement or set of 
statements which are currently active. 


DEFINE var_1, var_2, ... var_n:


where var_1 through var_n are the names of created fields previously
defined through the use of the LET statement. The DEFINE state- 
ment will permanently save the name and definition of each of the 
created fields mentioned in the DEFINE list. 


UNDEFINE field_1, field_2, ... field_n:


where field_1 through field_n are the names of fields which were perma-
nently created through the use of the DEFINE command. The UN- 
DEFINE statement specifies to remove the names of the permanently 
created fields from the list of all possible fields accessible through the 
retrieval language. 

DETAIL: 


The DETAIL statement specifies to automatically recap the current 
state of the dialogue when a process is executed. 


RECAP: 


The RECAP statement specifies to display the current state of the 
dialogue. 
INPUT FROM str: 


The INPUT statement causes OTSS to accept input from a pre- 
viously prepared file identified by str. 
sail t hae ale 


eee aaa null 


When the SAVE statement is given by the user, the system writes 
the current state of the dialogue on to the file identified by str. 
DATABASE str: 


where str is the name of another data base. The DATABASE state- 
ment allows the user to switch from one data base to another from
within OTSS.
ERASE str: 


The ERASE statement causes the disk file identified by str to be 
erased. 


VOCABULARY: 


The VOCABULARY statement specifies to print out the entire list 
of keywords and their associated synonyms available in the OTSS 
retrieval language. 

RETURN: 


The RETURN statement is used to return control to the operating 
system level. 
STOP: 


The STOP statement will disconnect the user from the time-sharing 
system. 


CREATE aur 
null 
ALL 
SECURITY | INTERROGATE | 
user, usere, > -Uusern 
REMOVE ee 
user, usere, ***USern 


The SECURITY statement is used by a data base administrator to 
define, interrogate, and remove security information for his user 
audience. 

GO: 


The GO statement causes the last-mentioned process to be executed.


The Potential in a Charge-Coupled Device
With No Mobile Minority Carriers 


By J. McKENNA and N. L. SCHRYER 
(Manuscript received May 17, 1973) 


The potentials and fields in a two-dimensional model of a charge- 
coupled device (CCD) are studied. We assume no mobile minority car- 
riers have been injected into the CCD and that the electrode voltages do 
not vary with time. The nonlinear equations describing the devices are 
first linearized using the depletion layer approximation. The linearized 
equations are then solved approximately by a fitting technique. Both
surface and buried channel CCD’s are considered. The accuracy and cost
of obtaining the solution are discussed. This work is a continuation of a
study initiated in an earlier paper. 


I. INTRODUCTION AND SUMMARY 


In this paper we study the electrostatic potential distribution and 
fields in a two-dimensional model of a charge-coupled device?? (CCD). 
This work is a continuation of a study initiated in an earlier paper,¹
hereafter referred to as I. In I we considered a static, two-dimensional
model with no mobile charge, and with electrodes so close together
that they could be assumed to touch. We showed there that the deple-
tion layer approximation⁴ could be used to linearize the potential
equations, and the linearized equations were then solved analytically. 
The numerical evaluation of these solutions was shown to be very 
accurate and cheap. 

We extend the model of I to allow for gaps between the plates. 
Our purpose here is twofold. We want to examine the dependence 
of the potentials and fields in a CCD on various design parameters. 
As we show, our model allows considerable flexibility in describing 
various electrode configurations. In addition, however, we want to 
demonstrate a method of numerically solving the potential equations 
which we believe is of considerable interest in itself. 

Both surface? and buried channel®:§ CCD’s are considered. However,
as in I, only the analysis for buried channel CCD’s is given. The results
for a surface CCD can be obtained as special cases of the results for 
buried channel CCD’s. We refer the reader to I for a more detailed 
derivation of the equations and for a discussion of their linearization 
by the depletion layer approximation. 

As we show by examples, fairly complicated models of CCD’s can 
be analyzed at moderate cost by the methods of this paper. Neverthe- 
less, the cost of using the methods of I to analyze a CCD with zero 
separation between the electrodes is typically an order of magnitude 
less than the cost of using the methods of this paper to analyze a CCD 
with nonzero electrode separation. This suggests that, in any com- 
plicated design problem, the methods of I should be used to rough out 
a solution, and then the solution should be “fine tuned” by using the 
methods of this paper. In addition, it is shown that, when the gaps 
between the electrodes are of the order of 1 µm, the potentials of
interest are approximated well by the potentials in the same CCD 
with the electrode separation set to zero. 

The nonlinear equations and boundary conditions defining the
boundary value problem are introduced in Section II. The linearized
equations are also introduced there. In Section III we discuss in some
detail how we obtain approximate solutions to the linearized problem. 
The reader uninterested in the mathematical details should skip Section 
III and proceed directly to Section IV, which is devoted to examining 
some of the solutions with emphasis on how they are affected by 
changes in the design parameters. We examine a number of different 
design parameters, particularly for buried channel devices. The accu-
racy of the solutions and the cost of obtaining them are considered in
Section V. Finally, some mathematical details are contained in two
appendixes. 


II. THE POTENTIAL EQUATIONS


We consider CCD’s in which the minority carriers are holes and the 
underlying substrate is n-type silicon. The analysis can be modified in 
an obvious way to describe the case where the minority carriers are 
electrons and the substrate is p-type silicon. 

A buried channel CCD consists of a substrate of n-type silicon on 
top of which there is a layer of p-type silicon. The p-type layer is 
covered with a layer of SiO₂, and closely spaced electrodes are placed
on top of the oxide layer. A schematic diagram of such a device is 
shown in Fig. 1 with some typical dimensions indicated. A surface CCD 
is the same, except that the p-type layer is missing.


Fig. 1—A schematic diagram of a buried channel CCD.


We study here the static potential and fields in either a surface or 
buried channel CCD in the absence of mobile charge. Since the length
in the z-direction of each plate is much greater than its width in the
x-direction, near the center of the plates (z* = 0) the field is essentially
two-dimensional; therefore, we treat the problem as two-dimensional.

We assume the bottom (n-type) substrate is infinitely thick. The 
field can penetrate into the substrate little beyond a depletion depth 
and, since for typical voltages the depletion depth ranges from 7 to
20 µm and the thickness of a typical device is 100 µm, this is a very
good approximation. 

It is assumed that there are gaps between the electrodes. We also 
make the approximation that the electrodes have zero thickness. 
Although this is a rather drastic simplification, we feel the essential 
effects of the gaps between the electrodes are still properly described. 
It will be seen later that electrodes of rectangular cross section could 
be studied, although at much greater cost. We further assume that the 
medium surrounding the electrodes has the same dielectric constant as 
the SiO₂. This is very reasonable, since in practice a CCD is covered
with a dielectric coating. Two basic types of metalization are studied, 
single level and double level. In double-level metalization, two layers 
of electrodes are separated by an oxide layer. This is illustrated 
schematically in Fig. 2. We simulate this situation by assuming that
the potential distribution in the gaps between the electrodes in the
upper layer of electrodes is a known function. (Typically, the potential
is assumed to vary piecewise linearly.) In single-level metalization, the
upper level of electrodes is missing. Here we assume the dielectric
coating over the electrodes is infinitely thick. Again, this is a reasonable
assumption, since typically the field will have died out before reaching
the surface of the dielectric coating.


Fig. 2—A schematic diagram of one cell of a buried channel CCD with
double-level metalization.

Finally, we assume the structure to be periodic in the x-direction, 
which in the usual mode of operation is an excellent approximation. 

The boundary value problem corresponding to our model of a buried 
channel CCD can be described by a system of partial differential 
equations which we wish to write in terms of dimensionless quantities. 
All dimensional quantities (measured in rationalized MKS units) will 
be starred, with the exception of a few obvious physical parameters. 
Corresponding unstarred quantities will be dimensionless. The physical 
parameters of the problem are $\epsilon_1$ and $\epsilon_2$, the permittivity of the oxide
and silicon, respectively; $-e$, the charge of an electron; Boltzmann’s
constant $k$; the absolute temperature $T$; the length of a unit cell in the
device $L^*$; the separation of the two layers of metalization $h_{-1}^*$; and
$h_1^*$ and $h_2^*$, the thickness of the oxide layer and the p-layer, respectively.
The donor number density of the n-type substrate is $N_D$, which in the
usual method of fabricating a CCD is a constant. However, we assume 
the acceptor number density in the p-layer is given by the expression’ 

$$N_A^*(y^*) = C_s^* \exp\left\{-\left(\frac{y^* - h_1^*}{h_2^* - h_1^*}\right)^2 \ln\frac{C_s^*}{N_D}\right\} - N_D, \tag{1}$$

where $C_s^*$ is the number density of acceptor ions at the upper surface
of the Si.

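Now define the (dimensional) Debye length $\lambda_D$,

$$\lambda_D = (\epsilon_2 kT/e^2 N_D)^{1/2}. \tag{2}$$

Then the dimensionless lengths are defined as

$$x = x^*/\lambda_D, \quad y = y^*/\lambda_D, \quad L = L^*/\lambda_D, \quad h_\alpha = h_\alpha^*/\lambda_D, \quad (\alpha = \pm 1, 2). \tag{3}$$

The dimensionless potential is related to the dimensional potential by

$$\varphi(x, y) = e\varphi^*(x^*, y^*)/kT. \tag{4}$$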
If we set

$$C_s = C_s^*/N_D, \tag{5}$$

then the dimensionless p-layer acceptor density, $\sigma(y)$, is

$$\sigma(y) = C_s \exp\left\{-\left(\frac{y - h_1}{h_2 - h_1}\right)^2 \ln C_s\right\} - 1. \tag{6}$$
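
As a numerical illustration of the scaling in eqs. (2) and (6), the following Python sketch evaluates the Debye length and the dimensionless acceptor profile. The relative permittivity of silicon (11.7), the temperature, the donor density, the layer thicknesses, and the surface concentration ratio are all assumed values chosen for illustration; they are not taken from the paper. The profile follows the Gaussian form of eq. (6) as it is read here.

```python
import math

EPS0 = 8.8541878128e-12    # vacuum permittivity, F/m
K_B = 1.380649e-23         # Boltzmann's constant, J/K
Q_E = 1.602176634e-19      # magnitude of the electron charge, C


def debye_length(n_d, temperature=300.0, eps_r=11.7):
    """Eq. (2): lambda_D = sqrt(eps_2 k T / (e^2 N_D)), in meters (MKS units)."""
    eps_2 = eps_r * EPS0                       # assumed permittivity of silicon
    return math.sqrt(eps_2 * K_B * temperature / (Q_E ** 2 * n_d))


def sigma(y, h1, h2, c_s):
    """Eq. (6): dimensionless net acceptor density in the p-layer."""
    t = (y - h1) / (h2 - h1)
    return c_s * math.exp(-t * t * math.log(c_s)) - 1.0


lam = debye_length(1.0e20)       # assumed N_D = 1e14 cm^-3 = 1e20 m^-3
h1 = 0.1e-6 / lam                # assumed 0.1 um oxide, in Debye lengths
h2 = h1 + 5.0e-6 / lam           # assumed 5 um p-layer, in Debye lengths
c_s = 100.0                      # assumed ratio of surface acceptor to donor density

surface_value = sigma(h1, h1, h2, c_s)    # equals c_s - 1 at the oxide interface
junction_value = sigma(h2, h1, h2, c_s)   # vanishes at the p-n junction
```

With these assumed values the Debye length is about 0.4 µm, so the layer thicknesses are a fraction of, or a modest multiple of, the unit of length.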


In the strip $0 \le x \le L$, let $\varphi_0$ denote the electrostatic potential
above the oxide layer, $-\infty < y \le 0$ in the case of single-level metaliza-
tion and $-h_{-1} \le y \le 0$ in the case of double-level metalization.
Further, let $\varphi_1$ denote the potential in the oxide layer, $0 \le y \le h_1$; $\varphi_2$
the potential in the p-type layer, $h_1 \le y \le h_2$; and $\varphi_3$ the potential in
the n-type substrate (see Fig. 2). Then in the dimensionless form, the
potential equations are

$$\nabla^2\varphi_0 = 0, \qquad y \le 0, \tag{7}$$
$$\nabla^2\varphi_1 = 0, \qquad 0 \le y \le h_1, \tag{8}$$
$$\nabla^2\varphi_2 = \sigma(y), \qquad h_1 \le y \le h_2, \tag{9}$$
$$\nabla^2\varphi_3 = \exp(\varphi_3) - 1, \qquad h_2 \le y < \infty, \tag{10}$$

where $\nabla^2$ is the two-dimensional Laplace operator. The standard
electromagnetic boundary conditions are as follows:

$$|\varphi_0(x, -\infty)| < \infty \quad \text{(single-level metalization), and} \tag{11}$$
$$\varphi_0(x, -h_{-1}) = U(x) \quad \text{(double-level metalization),} \tag{12}$$


where $U(x)$ is a given periodic function of period $L$, assuming on each
electrode of the second level of metalization the constant voltage of
the electrode and a specified potential between the electrodes. In
reality, of course, the true potential in the gaps of the second level of
metalization is unknown a priori. However, as indicated in Fig. 2,
typically the semiconductor cannot ‘see’ these gaps since they are 
shielded by the electrodes of the first level of metalization. Thus we 
simulate the exact boundary conditions in the gaps, most often by 
assuming the potential varies linearly from one electrode to another. 
We feel this is a good approximation, since we have performed calcu- 
lations of the potential in the semiconductor with several different 
assumptions about the variation of the potential in the gaps, and the 
results were essentially identical. Further, 


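$$\varphi_0(x, 0) = V_j = \varphi_1(x, 0), \qquad (j = 1, 2, \ldots, p), \tag{13}$$
$$\varphi_0(x, 0) = \varphi_1(x, 0), \qquad \frac{\partial\varphi_0}{\partial y}(x, 0) = \frac{\partial\varphi_1}{\partial y}(x, 0) + \rho_s(x), \tag{14}$$

where $V_j$ is the constant voltage of the $j$th electrode in the first level
of metalization, eq. (13) holds on each of the $p$ electrodes, and (14)
holds in the gaps between the electrodes. Typically, $\rho_s(x) = 0$, but in
some cases it may describe a deliberately implanted surface charge in
the gaps. In any event, $\rho_s(x)$ is a known function of $x$ in the gaps, and
$\rho_s(x) = 0$ on the electrodes. Finally,

$$\varphi_1(x, h_1) = \varphi_2(x, h_1), \qquad \eta\frac{\partial\varphi_1}{\partial y}(x, h_1) = \frac{\partial\varphi_2}{\partial y}(x, h_1) + Q(x), \tag{15}$$
$$\varphi_2(x, h_2) = \varphi_3(x, h_2), \qquad \frac{\partial\varphi_2}{\partial y}(x, h_2) = \frac{\partial\varphi_3}{\partial y}(x, h_2), \tag{16}$$
$$\varphi_3(x, \infty) = 0, \tag{17}$$

and for all $y$

$$\varphi(0, y) = \varphi(L, y), \qquad \frac{\partial\varphi}{\partial x}(0, y) = \frac{\partial\varphi}{\partial x}(L, y). \tag{18}$$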


In (15), Q(x) is a known, periodic surface charge density, which may 
include deliberately implanted charges,’ and 


$$\eta = \epsilon_1/\epsilon_2. \tag{19}$$

All the boundary conditions (11) to (12) and (14) to (17) hold for
$0 \le x \le L$.

The equations for the potential in a surface CCD are essentially the 
same, except the p-type layer is eliminated. We only give the analysis 
for the buried channel CCD. The results for the surface CCD can be 
obtained from those for the buried channel CCD by setting $\sigma(y) = 0$,
$h_1 = h_2$, and $\varphi_2 = \varphi_3$. In either case, the fields are obtained from the
potential by

$$\mathbf{E} = -\nabla\varphi. \tag{20}$$


The results of I show that the system of eqs. (7) to (18) can be 
accurately solved by the method of finite differences only at great 
expense for even the simplest of devices. However, it was shown in I 
that, for the simpler problem studied there, the nonlinear boundary 
value problem could be replaced by a linear boundary value problem. 
This linear problem was solved analytically. It was then shown that 
under appropriate conditions the solution of the linear problem was an 
excellent approximation to the solution of the nonlinear problem in 
the p-type layer for a buried channel CCD and near the oxide-semi- 
conductor interface for a surface CCD. The condition for the approxi- 
mation to be a good one is, basically, that the potential along the line 
$y = h_2$ be large and negative. That condition holds in this problem, as
we show later by example. Although the linear problem is much more 
complicated in this case because of the gaps between the electrodes and 
we have been unable to solve it analytically, we have been able to 
obtain good approximate solutions of it. For these reasons we now 
formulate the linear problem. 

The linear equations are based on the depletion layer approximation,
and we refer the reader to I and Ref. 4 for a detailed discussion. The
linearization consists of replacing the single region $h_2 \le y < \infty$ by two
regions, $h_2 \le y \le h_3 = h_2 + R$ (the depletion layer) and $h_3 \le y < \infty$,
and replacing the single nonlinear equation (10) by a different linear
equation in each of these subregions (see Fig. 3). Denoting the
linearized potentials by $\psi_\alpha$, we have

$$\nabla^2\psi_0(x, y) = 0, \qquad y \le 0, \tag{21}$$
$$\nabla^2\psi_1(x, y) = 0, \qquad 0 \le y \le h_1, \tag{22}$$
$$\nabla^2\psi_2(x, y) = \sigma(y), \qquad h_1 \le y \le h_2, \tag{23}$$
$$\nabla^2\psi_3(x, y) = -1, \qquad h_2 \le y \le h_3 = h_2 + R, \tag{24}$$
$$\nabla^2\psi_4(x, y) = \psi_4(x, y), \qquad h_3 \le y < \infty. \tag{25}$$


Fig. 3—A schematic diagram of the depletion layer approximation for one cell of a
buried channel CCD with double-level metalization.


In addition to $\psi_0$, $\psi_1$, $\psi_2$, and $\psi_3$ satisfying boundary conditions (11) to
(16), we have the boundary conditions for $0 \le x \le L$

$$\psi_3(x, h_3) = \psi_4(x, h_3), \qquad \frac{\partial\psi_3}{\partial y}(x, h_3) = \frac{\partial\psi_4}{\partial y}(x, h_3), \tag{26}$$
$$\psi_4(x, \infty) = 0, \tag{27}$$

and the $\psi_\alpha$ $(\alpha = 0, 1, 2, 3, 4)$ all satisfy (18). The pseudodepletion
depth $R$ is best given by


, hy 


R=-(1+h—h+")+[(m— m+ 2) —1-27 


° } 
- 789,42 f"(¢- n+) o(@)de]- (28) 
n hy 7 
In (28), 


Q.=7 ii ” Q(x)de
and V is the average electrode voltage. As shown in I and Ref. 4, $\psi_0$,
$\psi_1$, and $\psi_2$ are quite insensitive to the choice of $R$, and an optimal
choice of $R$ tends to minimize $\sup_{0 < x < L} |\psi_4(x, h_3) + 1|$.


III. SOLUTION OF THE LINEARIZED POTENTIAL EQUATIONS 


The method we shall use to solve the linear system of elliptic eqs.
(21) to (27) has much in common with previous work⁹ on the classical
problem of a single linear elliptic boundary value problem on a simply 
connected domain in the plane. The technique used in Ref. 9 is quite 
simple: Construct a family of particular solutions of the partial 
differential equation and, using a finite linear combination of these 
particular solutions, obtain a Chebyshev fit to the boundary conditions 
at a finite number of points on the boundary of the domain. It was 
shown that, as more and more particular solutions are taken in the 
linear combination and more and more points are chosen in the fit on 
the boundary, the linear combination converges to the true solution. 

In this paper we construct a family of particular solutions of (21) 
to (27). These solutions depend linearly on a finite number N of param- 
eters and satisfy all the boundary and interface conditions except that 
they do not assume the correct voltages on the electrodes at y = 0. 
We then complete the analogy with Ref. 9 by picking $M$ points on the
plates, $x_i$, $1 \le i \le M$, where $M \ge N$, and force the potential to take
on the correct value at the points $x_i$ in the least-squares sense.

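We obtain the family of particular solutions in the form of Fourier
series. We assume as given the Fourier series expansions of $U(x)$ and
$Q(x)$:

$$U(x) = \tfrac{1}{2}\alpha_0 + \sum_{n=1}^{\infty} G_n(x), \tag{29}$$
$$Q(x) = \tfrac{1}{2}\xi_0 + \sum_{n=1}^{\infty} \Phi_n(x), \tag{30}$$

where

$$G_n(x) = \alpha_n \cos\lambda_n x + \beta_n \sin\lambda_n x, \tag{31}$$
$$\Phi_n(x) = \xi_n \cos\lambda_n x + \zeta_n \sin\lambda_n x, \tag{32}$$

and

$$\lambda_n = (2n\pi)/L. \tag{33}$$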


Further, from I, we can write down formal expressions for $\psi_1$, $\psi_2$, $\psi_3$,
and $\psi_4$ which satisfy (22) to (25) and boundary conditions (15), (16),
(18), (26), and (27). Let

$$F_n(x) = a_n \cos\lambda_n x + b_n \sin\lambda_n x, \tag{34}$$
$$E_\pm = 1 \pm 1/\eta, \tag{35}$$
$$\Lambda_n^\pm = 1 \pm (1 + \lambda_n^{-2})^{1/2}, \tag{36}$$


M,(y) = {HyAsd + B_Ajz e7? alisha) } e-dny 
+ {H_Afe- 4 4+ Hy Apen*nks}edny, (37) 
Laly) = 2{Atermy + Ave Pathe}, (38) 
where $\eta$ is given by (19), $\lambda_n$ by (33), and $a_0$, $a_n$, $b_n$ $(n = 1, 2, \ldots)$ are
unknown constants. Then these expressions are
¥ilx, y) = (Ado + B) + (Cao + D)y 


+B [rc MP + oxy 42 ehat) | 











baa, w) = 5 | dal + ha ~ hs) — (hs ~ ha)? 
~ (da + 2he ~ 2ha\(y — be) +2 f" W— f)o(Eae| 








+ = {F n(%) + ®, (x) sinh a nly ne ; (40) 
v3(2, y) = 2Lao(1 + h3 — y) — (y — hs)? | 
+2 {Fe cane | a 
Walz, y) = Ge + 4 > {Ps (x) + @ a(o) Set | 
exp [—Vi + AR(y — hs) — Anka], 
MAO) (42) 
where 
1 hi 
A=5(1+m—h+), (43) 


B -{- (: moe (3 Nae ™ o(eae 
= ; | ( — he)(hs + he — 2hi) + 2 (to + 2hs — 2h) | , (44) 
C =— 1/(2n), (45) 
= E + 2(hg — he) — 2 A ‘ o(é)aé| i. (Qn), (46) 
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and 
Go = ($a) — B)/A. (47) 
It should be noted that 
Wi(@, 0) = $a + Y Fa(a). (48) 


In the case of single-level metalization we can write down an expres-
sion for a solution of (21) in $y \le 0$ which satisfies boundary condition
(11):

Wo(t) = 30 + a F,(a)e», (49) 
In the case of double-level metalization, a solution of (21) in
$-h_{-1} \le y \le 0$ satisfying boundary condition (12) is


a a sinh \n(y + A_1) 
Yo(a, y) = 5(1 +5 a L ar 2 F..(2) sinh \,h_1 





Qo sinh Any 
eve = EG) Sey ee) 


Note that these solutions have been constructed so that from (48)
and either (49) or (50), for $0 \le x \le L$,

$$\psi_0(x, 0) = \psi_1(x, 0). \tag{51}$$

[Note also that, term by term, (49) is the limit as $h_{-1} \to \infty$ of (50).]

Equations (34) to (50) contain expressions for $\psi_\alpha$, $0 \le \alpha \le 4$, which
satisfy the differential equations (21)-(25) and all the required bound-
ary and interface conditions except condition (13) on the plates and
the normal derivative condition of (14) in the gaps. These particular
solutions contain the unknown parameters $a_0$, $a_n$, $b_n$ $(n = 1, 2, \ldots)$,
which remain to be determined. At this point it might be assumed that
the series should be truncated at some $n = N$ and the $2N + 1$ coeffi-
cients $a_0$, $a_n$, $b_n$, $n = 1, 2, \ldots, N$, be determined directly by making a
least-squares fit to the remaining boundary conditions. However, it
can be shown¹⁰ that if $x_{j-1} < x_j$ are the end points of an electrode on
$y = 0$, then for $x_{j-1} < x < x_j$ and $x$ near $x_j$, say, $\partial\psi_0/\partial y\,(x, 0)$ will
behave like $(x_j - x)^{-1/2}$ plus a power series in $(x_j - x)^{1/2}$. This implies
that the Fourier series for $\psi_0(x, 0)$ converges very slowly. In fact, we
have found that it is often necessary to take up to 2000 terms in the
series to represent $\psi_0(x, 0)$ adequately. This makes it impractical to
use the Fourier coefficients themselves as the parameters to be deter-
mined directly by a least-squares process.

Instead, we used the following technique. We approximated the
charge density on $y = 0$ by a finite sum of known functions

$$\rho(x) = \rho_s(x) + \sum_{j=1}^{N} \rho_j\,\rho_j(x). \tag{52}$$


The functions $\rho_j(x)$ are zero except on the electrodes and are of several
types, as shown in Fig. 4. The function $\rho_0(x)$ is a periodic, triangular
spline for the ends of the device, as shown in Fig. 4a. Corresponding to
the edge of each plate, there is a discontinuous triangular spline, Figs.
4b and 4c, and singular splines of the form $|x - x_e|^{-1/2}$, where $x_e$ is the
position of the plate edge, Figs. 4d and 4e.
The remaining $\rho_j(x)$, whose supports lie wholly interior to the elec-
trodes, are triangular splines as shown in Fig. 4f.

Now each unknown parameter $a_0$, $a_n$, $b_n$ $(n = 1, 2, \ldots)$ is deter-
mined as a linear sum of the $N$ parameters $\rho_j$, $1 \le j \le N$, by equating
$\rho(x)$, given in (52), with $\partial\psi_0/\partial y\,(x, 0) - \partial\psi_1/\partial y\,(x, 0)$. In the case of
single-level metalization, from (39) and (49),


OYo Ov 
Oy (x, 0) — Oy (x, 0) 











=o (Cao + D) —= 2 Fa(@)AnEn an > ®,,(x) nM, (0) ) (53) 
where 
_ MO) _ 
ales 8) al 
It should be noted from (33), (35), (37), and (38) that, as $n \to \infty$,
E,&— 2, Linlhy) py 2 gaat, (55) 


7M ,(0) “~ 1+ 7 


In the case of double-level metalization, from (39) and (50), 











Oo -_ oy _ 40 — A ~ = = 
By (x, 0) ay (x, 0) = Tha (Cao + D) > Pile) NaH a 
= An = La(hi) 
7 2s Ga) ey a) 
where 
_ M0) _ 
H, = X.M,(0) ctnh (A,h_1). (57) 
It again follows from (33) and (37) that, as $n \to \infty$,
H, RS 2. (58) 


Fig. 4—The splines used to represent the charge density on the electrodes.


For a periodic function $f(x)$, of period $L$, we denote by $c_n[f]$ and
$s_n[f]$ the cosine and sine Fourier coefficients of $f$:

$$c_n[f] = \frac{2}{L}\int_0^L f(x)\cos\lambda_n x\,dx, \qquad s_n[f] = \frac{2}{L}\int_0^L f(x)\sin\lambda_n x\,dx. \tag{59}$$
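
The coefficients of eq. (59) are easy to check numerically for a triangular spline. The sketch below assumes the normalization $2/L$ (the one consistent with the leading terms $\tfrac{1}{2}\alpha_0$ and $\tfrac{1}{2}\xi_0$ in (29) and (30)) and compares a midpoint-rule quadrature with the standard closed form for a unit-height triangle. The function names and the sample spline are hypothetical, and the closed form is a textbook result rather than a quotation from Appendix A.

```python
import math


def fourier_coefficients(f, period, n, samples=20000):
    """Approximate c_n[f] and s_n[f] of eq. (59) with the midpoint rule."""
    lam_n = 2.0 * math.pi * n / period
    dx = period / samples
    c = s = 0.0
    for k in range(samples):
        x = (k + 0.5) * dx
        fx = f(x)
        c += fx * math.cos(lam_n * x)
        s += fx * math.sin(lam_n * x)
    return 2.0 * c * dx / period, 2.0 * s * dx / period


def hat(center, half_width):
    """Unit-height triangular spline supported on center +/- half_width."""
    return lambda x: max(0.0, 1.0 - abs(x - center) / half_width)


def hat_coefficients(center, half_width, period, n):
    """Closed-form c_n and s_n of the triangular spline (support inside one period)."""
    lam_n = 2.0 * math.pi * n / period
    u = 0.5 * lam_n * half_width
    sinc = 1.0 if u == 0.0 else math.sin(u) / u
    scale = 2.0 / period * half_width * sinc ** 2
    return scale * math.cos(lam_n * center), scale * math.sin(lam_n * center)


PERIOD, CENTER, HALF_WIDTH = 1.0, 0.3, 0.1      # assumed illustration values
numeric = fourier_coefficients(hat(CENTER, HALF_WIDTH), PERIOD, 5)
closed = hat_coefficients(CENTER, HALF_WIDTH, PERIOD, 5)
```

The coefficients of a triangular spline decay like $n^{-2}$, whereas those of the singular splines decay far more slowly, which is why the edge behavior is carried by the splines rather than by a truncated Fourier series.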


Then, from (52), the Fourier series for $\rho(x)$ is

$$\rho(x) = \frac{1}{2}\Big\{c_0[\rho_s] + \sum_{j=1}^{N}\rho_j c_0[\rho_j]\Big\}
+ \sum_{n=1}^{\infty}\bigg[\Big\{c_n[\rho_s] + \sum_{j=1}^{N}\rho_j c_n[\rho_j]\Big\}\cos\lambda_n x
+ \Big\{s_n[\rho_s] + \sum_{j=1}^{N}\rho_j s_n[\rho_j]\Big\}\sin\lambda_n x\bigg]. \tag{60}$$

In Appendix A we give $c_n[\rho_j]$ and $s_n[\rho_j]$ for the various functions
$\rho_j(x)$. If we now equate (53) with (60), we can obtain the $a_n$ and $b_n$
as linear functions of the $\rho_j$ in the case of single-level metalization:


_2(BC —AD)_A uae 
ay = AEE) — Fleas] + & oni) (61) 


Oy =— [taLn(ha)V/[n\nBaM (0) 
— featool + % excaCoadl /OnBe), (62) 
j= 
by = — [EaLn(hs)]/[n\nBnM.(0)1 
= {sale aS eweLpal| JOB). (68) 
j=l 


Similarly, if (56) is equated with (60), we obtain the $a_n$ and $b_n$ in the
case of double-level metalization:


ag we Dhe(BC = AD) — oA Als 
a (haG — A (hCG — A 


Jeter + % ewan}, (64) 


me) An a €nDn(hi) 
H, sinh (hada) nH M,(0) 


— lester + paetpsl} / Oats), (68) 


po =— — Ba nna) _ 
"=~ Fr sinh (hada) ghelTaMf,(0) 


— fester + % osseCnil} / Onis). (66) 


Equations (39) to (42) and either (49) or (50), with the $a_n$ and $b_n$
defined by eqs. (61) to (63) or (64) to (66), respectively, define solutions
to eqs. (21) to (25) which satisfy boundary conditions (11) or (12),
(14) to (16), (26), and (27). They do not, however, assume the correct
values on the electrodes at $y = 0$; i.e., (13) is not satisfied. These
solutions depend linearly on the $N$ unknown constants $\rho_j$, $1 \le j \le N$,
and of course on the choice of functions $\rho_j(x)$ used to describe the
charge density on the plates. Having picked the $\rho_j(x)$ described
earlier, $M_j$ points are chosen on the $j$th electrode, $x_i^{(j)}$, $1 \le i \le M_j$, with

$$M = \sum_{j=1}^{p} M_j \ge N.$$

Then the expression

$$\sum_{j=1}^{p}\sum_{i=1}^{M_j}\left[\psi_0(x_i^{(j)}, 0) - V_j\right]^2 \tag{67}$$

is minimized with respect to the $\rho_k$.

It has been observed that near the edge of an electrode, where the
potential has square-root behavior, the fitting points $x_i^{(j)}$ should be
spaced quadratically closer together as the edge of the electrode is
approached. If $x = e$ is an edge of the $j$th plate, then we distributed
the points near this edge by picking $b \ne e$ and an integer $m < M_j/2$
and setting


z= b+ (0 VNoosst— VF, (LS ism). — (68) 
The points $z_i$ were then used as the fitting points $x_i$ near the edge of
the plate. Away from the edges of the plate, the fitting points were
uniformly distributed.
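
One simple way to place points that crowd quadratically toward an edge is sketched below in Python. The map used here, $z_i = e + (b - e)\,t_i^2$ with $t_i = (i - \tfrac{1}{2})/m$, is an assumption chosen for illustration and is not necessarily the formula of eq. (68); the function name and the sample values are likewise hypothetical.

```python
def edge_clustered_points(e, b, m):
    """Return m points between e and b whose spacing shrinks toward the edge e.

    Assumed map (for illustration only): z_i = e + (b - e) * t_i**2 with
    t_i = (i - 1/2) / m, i = 1..m. No point coincides with the edge itself,
    where the charge density is singular.
    """
    return [e + (b - e) * ((i - 0.5) / m) ** 2 for i in range(1, m + 1)]


# Hypothetical plate edge at x = 1.0, clustering region reaching back to b = 0.8.
points = edge_clustered_points(e=1.0, b=0.8, m=5)
gaps = [abs(points[i + 1] - points[i]) for i in range(len(points) - 1)]
```

The gaps between successive points grow linearly with distance from the edge, which is the discrete counterpart of a quadratic stretching of a uniform grid.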

We have assumed the existence of a bounded solution $\psi(x, y)$ to the
linearized problem. In Appendix B we show that if $\psi(x, y)$ is the true
solution of the linearized problem and $\psi_a(x, y)$ is one of our approximate
solutions, then the error $\psi(x, y) - \psi_a(x, y)$ is bounded at every point
by the maximum error on the electrodes. Since the true solution is
known on the electrodes, this provides us with a posteriori error bounds.
We will make use of this important point later in evaluating the
quality of our approximate solutions.

The technique described in this section can be formulated in a 
rather general setting and, we believe, can be applied to many problems 
of interest in physics and engineering. It has been used by Morrison
et al. in a study of microwave scattering by deformed raindrops.¹¹
Assume that a problem can be separated into two parts: Input data 
and a governing system of partial differential equations (PDE’s), with 
possible interface conditions, which determine the solution when given 
the input data. Further, assume that linear families of particular solu- 
tions to the PDE’s can be found. For example, these may be con- 
structed by separation of variables, Fourier series, Green’s Theorems, 
etc. Finally, assume that by linearly parameterizing some unknowns 
of the solution (for our problem, the charge distribution on the plates) 
we can obtain particular solutions to both the PDE’s and the interface 
conditions. Then one could use some fitting procedure, a discrete 
least-squares fit, for example, to force the linear family of particular 
solutions to the governing system to have approximately the same 
input data as the desired solution. This generates a solution of exactly
the same governing system, but with input data which differs by a 
known, and hopefully small, amount from the desired input data. For 
all practical purposes, this gives an effective bound on the error in the 
computed approximate solution. For example, if the desired input is 
only known to 1 percent, because of experimental error in measuring 
it, then any solution generated by the above procedure which corre- 
sponds to the desired input data perturbed by at most 1 percent 
cannot, on the basis of comparing inputs, be distinguished from the 
true solution of the problem. 

Also, in many cases, one can use the Maximum Principle, conserva- 
tion of energy, or some other basic principle to give sharp, rigorous 
bounds on the error in such an approximate solution in terms of how 
well it satisfies the given input data. We do this for our problem in 
Appendix B. This is a very great improvement over the standard 
discretization methods for solving such problems. Those methods 
generally give an approximate solution to an approximate system of 
equations, but with exactly the given input data, with the result that 
it is very difficult to estimate reliably what the true error is for a given 
approximate solution. 
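
The general procedure can be sketched in Python on a deliberately simpler
problem than the CCD: Laplace's equation on the unit disk, where the
particular solutions r^n cos nθ and r^n sin nθ form the linear family and
the boundary values play the role of the input data. The function names,
the test boundary data, and the choice of equally spaced fitting points
are assumptions made only for illustration. With equally spaced points the
least-squares normal equations are diagonal, so the fit reduces to
discrete Fourier sums. The Maximum Principle then bounds the interior
error by the largest misfit on the boundary.

```python
import math

def fit_harmonic_family(g, n_terms=8, n_fit=64):
    """Discrete least-squares fit of c0 + sum_n (a_n cos n*t + b_n sin n*t)
    to boundary data g(t) at n_fit equally spaced points (n_fit > 2*n_terms).
    The basis columns are orthogonal on such points, so the least-squares
    solution is given directly by discrete Fourier sums."""
    thetas = [2.0 * math.pi * k / n_fit for k in range(n_fit)]
    vals = [g(t) for t in thetas]
    c0 = sum(vals) / n_fit
    a = [2.0 / n_fit * sum(v * math.cos(n * t) for v, t in zip(vals, thetas))
         for n in range(1, n_terms + 1)]
    b = [2.0 / n_fit * sum(v * math.sin(n * t) for v, t in zip(vals, thetas))
         for n in range(1, n_terms + 1)]
    return c0, a, b

def evaluate(coef, r, theta):
    """Evaluate the fitted member of the harmonic family at (r, theta)."""
    c0, a, b = coef
    return c0 + sum(
        r ** n * (a[n - 1] * math.cos(n * theta) + b[n - 1] * math.sin(n * theta))
        for n in range(1, len(a) + 1))

def exact(r, theta):
    """Illustrative true solution: Re exp(z), which is harmonic in the disk."""
    return math.exp(r * math.cos(theta)) * math.cos(r * math.sin(theta))

coef = fit_harmonic_family(lambda t: exact(1.0, t))

# A posteriori bound: the largest misfit found by searching the boundary.
boundary_err = max(abs(evaluate(coef, 1.0, t) - exact(1.0, t))
                   for t in (2.0 * math.pi * k / 2000 for k in range(2000)))

# The error at interior points should not exceed the boundary misfit.
interior_err = max(abs(evaluate(coef, r, t) - exact(r, t))
                   for r in (0.0, 0.3, 0.6, 0.9)
                   for t in (2.0 * math.pi * k / 200 for k in range(200)))
```

The fitted function satisfies Laplace's equation exactly and matches
slightly perturbed boundary data, so `boundary_err` serves as the error
bound for the whole disk.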


IV. THE POTENTIALS AND FIELDS IN SOME SPECIFIC CCD’S 


Using the method described in Section III, we have evaluated approx- 
imately the solutions of eqs. (21) to (25) for a number of different plate 
configurations and design parameters, and we present some of these 
results graphically in this section. 

We have assumed in each case that the n-type substrate doping is
N_D = 10^14 cm^-3, that ε_2/ε_0 = 12, where ε_0 is the permittivity of free
space, that ε_1/ε_2 = 1/3, and that Q(x) = 0, i.e., there is no trapped or
implanted charge at the oxide-semiconductor interface. Then at
T = 300°K, the Debye length is λ_D = 0.415 µm. In addition, we have
used the factor (kT/e) = 0.025 V to convert dimensionless potentials
to volts, and the factor (kT/eλ_D) = 600 V/cm to convert dimensionless
fields to volts per centimeter. In each example involving a buried
channel CCD, we assume that the acceptor number density in the
p-layer is given by (1), [(6)] with C_A* = 4.6 × 10^15 cm^-3 [C_A = 46]. In
each such case, this corresponds to an average number density of
acceptor atoms of 2 × 10^15 cm^-3 [see eq. (2.5) of I].
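
The scale factors quoted above can be checked with a short Python sketch.
It assumes the standard expression λ_D = [ε_2 kT/(e² N_D)]^(1/2) for the
Debye length (the definition used in the paper lies outside this section)
and assumes that number densities are made dimensionless by dividing by
N_D. The constant and function names are illustrative.

```python
import math

EPS0 = 8.8541878128e-12   # permittivity of free space, F/m
K_B = 1.380649e-23        # Boltzmann constant, J/K
Q_E = 1.602176634e-19     # elementary charge, C

def debye_length_m(n_d_cm3, eps_rel, temp_k):
    """Debye length in meters for donor density n_d_cm3 (cm^-3), assuming
    lambda_D = sqrt(eps * k * T / (e^2 * N_D))."""
    n_d_m3 = n_d_cm3 * 1.0e6
    return math.sqrt(eps_rel * EPS0 * K_B * temp_k / (Q_E ** 2 * n_d_m3))

lam = debye_length_m(1.0e14, 12.0, 300.0)
lam_um = lam * 1.0e6                       # about 0.41 um
kt_over_e = K_B * 300.0 / Q_E              # about 0.026 V (0.025 V in the text)
field_factor = kt_over_e / (lam * 100.0)   # V/cm, about 6 x 10^2
c_a = 4.6e15 / 1.0e14                      # dimensionless acceptor density, 46

print(f"Debye length = {lam_um:.3f} um")
print(f"kT/e = {kt_over_e:.4f} V, kT/(e lambda_D) = {field_factor:.0f} V/cm")
```

The computed values differ from the quoted 0.025 V and 600 V/cm only by
the rounding used in the text.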

In I we investigated the effects of changing the p-layer doping and 
thickness, and so here we concentrate mainly on the effects of gap 
width, plate potential, and the separation of the levels of metalization. 

Fig. 5—The channel potential φ* plotted as a function of x* for a three-phase
buried channel CCD. The 45-µm-wide electrodes are at 0, −5, and −10 V; the gaps
are 5 µm wide; h_1* = 0.1 µm; h_2* − h_1* = 5 µm; and C_A* = 4.6 × 10^15 cm^-3. The
dashed curve is for no surface charge implanted in the gaps; the solid curve is for
ρ_s*/e = 0.8 × 10^12 cm^-2 implanted in the gaps.


In Figs. 5 and 6 we show some properties of a three-phase buried
channel CCD with single-level metalization. The electrodes are 45 µm
wide, and the gaps between them are 5 µm wide. The p-layer is
5 µm thick (h_2* − h_1* = 5 µm), and the oxide layer is 0.1 µm thick
(h_1* = 0.1 µm). The region y < 0 is assumed to be filled with SiO2. The
potentials on the electrodes are 0, −5, and −10 V, as shown. The
dashed curve in Fig. 5 shows the channel potential φ* (that is, the
value of the potential at the potential minimum in the p-layer) as a
function of x* when there is no implanted surface charge in the gaps
between electrodes [ρ_s*(x*) = 0]. This curve illustrates one early
difficulty encountered in the design of buried channel CCD’s, namely
the large potential well under the gap between the plates. A CCD
with almost these same parameters was constructed® and did not work
because of the variable amounts of charge trapped in these wells. In
the remainder of this section we discuss a number of possible methods
of eliminating this potential well in the gaps between the plates.

Fig. 6—The channel field −∂φ*/∂x* plotted as a function of x* for the CCD of
Fig. 5 with ρ_s*/e = 0.8 × 10^12 cm^-2 implanted in the gaps.


An operating buried channel CCD has been reported}? in which the
gaps between the electrodes have been filled with a resistive material
so that the potential drop between the electrodes is essentially linear.
This CCD was also discussed in I, and it was shown there that the
potential wells are eliminated. Another technique for eliminating the
potential wells is to implant a layer of positive surface charge in the
gap between the electrodes. (Other schemes for eliminating this
problem are discussed in the literature.!*) The solid curve in Fig. 5
shows the channel potential in the same three-phase CCD after a uni-
form surface charge density, ρ_s*(x*)/e = 0.8 × 10^12 cm^-2 [ρ_s(x) = 578],
has been implanted in the gaps between the electrodes. Note that

\[
\int_{h_1^*}^{h_2^*} N_A(y^*)\, dy^* = 10^{12}\ \mathrm{cm}^{-2}.
\]

This technique should also eliminate the potential wells under the
gaps. In Fig. 6 we plot the channel field, −∂φ*/∂x* (that is, the
field at the potential minimum in the p-layer), as a function of x*.

Fig. 7—The channel potential in a buried channel CCD with double-level metaliza-
tion. The lower level electrodes are 10 µm wide with 5-µm gaps and are at potentials
of −5 and −7 V. The upper level is a single electrode at −5 V; h_1* = 0.1 µm;
h_2* − h_1* = 5 µm; C_A* = 4.6 × 10^15 cm^-3; and the separation of the metalization levels
is h_{-1}* = 0.1, 0.5, and 1.0 µm.


This shows that there are substantial fields in the gap between the —5 
and —7 V electrodes, but the field penetration under the electrodes is 
not too good. 

In Figs. 7 to 9 we investigate the possibility of eliminating the poten-
tial wells by the use of double-level metalization, in which the upper
level of metal is a single continuous piece covering the entire channel
and has a dc potential applied to it. The presence or absence of the
potential wells is mainly a local phenomenon and can be studied by
considering just two adjacent electrodes in the lower level of metaliza-
tion. Thus, in the interests of economy we consider a model CCD in
which alternate electrodes of the lower level are at the same voltage.
These plates are 10 µm wide and the gaps between them are 5 µm
wide. The oxide layer between the first level of electrodes and the
p-layer is 0.1 µm thick (h_1* = 0.1 µm) and the p-layer is 5 µm thick
(h_2* − h_1* = 5 µm). For the CCD of Fig. 7, the electrodes on the lower

Fig. 8—The channel potential in the CCD of Fig. 7 with the separation of the
metalization levels held fixed at h_{-1}* = 0.5 µm and the potential of the center, lower-
level electrode taking the values −7, −10, −15, and −20 V.


level are at a potential of −5 and −7 V, and the upper level consists
of a single electrode, covering the whole device, which is at a potential
of −5 V. In Fig. 7 we plot the channel potential for three different
separations of the levels of metalization, h_{-1}* = 0.1, 0.5, and 1 µm. Even
with h_{-1}* = 0.1 µm, there is still a very slight well in the gaps.

We next see what happens if we hold the separation of the levels of
metalization at h_{-1}* = 0.5 µm and change the voltage, v_1, on the middle
electrode in the lower level. In Fig. 8 we plot the resulting channel
potential, φ*, as a function of x* for v_1 = −7, −10, −15, and −20 V.
From these graphs we see that a potential difference of 15 V between

Fig. 9—The channel potential in the CCD of Fig. 7 with the separation of the
metalization levels held fixed at h_{-1}* = 0.1 µm and the potential of the center, lower-
level electrode taking the values −7 and −10 V.


neighboring electrodes on the lower level will insure the absence of 
potential wells in the gaps. 

In Fig. 9 we plot the channel potential for the same device but
with h_{-1}* = 0.1 µm and for v_1 = −7 and −10 V. For this small separa-
tion of the levels of metalization, a 5-V potential difference between
neighboring electrodes eliminates potential wells in the gaps.

In Figs. 10 and 11 we show some effects of gap width. We consider
first a buried channel CCD with double-level metalization. The p-layer
is 5 µm thick (h_2* − h_1* = 5 µm), the oxide layer between the first level
of electrodes and the p-layer is 0.1 µm thick (h_1* = 0.1 µm), and the
layer between the two levels of electrodes is 0.5 µm (h_{-1}* = 0.5 µm).
The upper level of electrodes consists of a single electrode at a potential
of −5 V. The lower level consists of electrodes 10 µm wide and, as
shown in Fig. 10, they are alternately at potentials of −5 and −7 V.
Curves of the channel potentials are plotted for three different gap
widths between plates: 5, 1, and 0 µm. (The 0-gap curve was calculated
by the methods of I.) The x* scales for the three curves are different, but
are chosen so the centers of the gaps coincide. With 5-µm-wide gaps,

Fig. 10—The channel potential in the CCD of Fig. 7 with the separation of the
metalization levels held fixed at h_{-1}* = 0.5 µm, and the gap width taking the values
5, 1, and 0 µm.


there are large potential wells, as we saw in Figs. 7 and 8. However, by 
reducing the gap width to 1 um, the potential wells are essentially 
eliminated. The curve for zero electrode separation is included to show 
that it is a good approximation to the channel potential in cases of 
small electrode separation. 

Finally, in Fig. 11, we plot the potential along the oxide-semicon-
ductor interface (y* = h_1*) for two surface CCD’s. In each case, the
oxide layer is 0.1 µm thick (h_1* = 0.1 µm) and the region y* < 0 is
assumed to be filled with SiO2. Also, in both cases, the electrodes are
10 µm wide and are held at alternate potentials of −5 and −7 V. In
one case the gap between electrodes is 1 µm, while in the other case
there are no gaps between electrodes. (The zero gap curve was calcu-

Fig. 11—The potential ψ*(x*, h_1*) plotted as a function of x* for a surface CCD
with single-level metalization; h_1* = 0.1 µm; 10-µm-wide electrodes held at −5 and
−7 V; and with gaps between electrodes of 1 and 0 µm.


lated by the method of I.) Again the x* scales differ, but the centers of 
the gaps coincide. Except in the region between the plates, the two 
curves coincide closely. 


V. COMMENTS ON ACCURACY AND COST 


We considered in some detail in I how well the solutions of the
linearized equations (21) to (25) approximate the solutions of the
nonlinear equations (7) to (10). It was shown there and in Ref. 4 that
as long as max ψ_1(x, h_1) ≤ −160 and |ψ_4(x, h_3) + 1| < 10 for 0 ≤ x ≤ L
[|ψ_3(x, h_2) + 1| < 10 for surface devices], then the solution of the
linearized problem approximates the solution of the nonlinear problem
to within several percent in the p-layer for buried channel devices
(near the oxide-semiconductor interface for surface devices). In addi-
tion, we are interested in examining the accuracy of the approximate
solutions of the linearized equations. As we stated in Section III (and
prove in Appendix B), the difference between the exact solution and
the approximate solution, δ*(x*, y*) = ψ*(x*, y*) − ψ^a*(x*, y*), is
bounded everywhere by its maximum value on the electrodes.

As typical examples, consider the curves in Figs. 10 and 11. For
all three curves in Fig. 10, we found that ψ_1(x, h_1) ≤ −345 and
|ψ_4(x, h_3) + 1| < 2.75 for 0 ≤ x ≤ L. For the two curves of Fig. 11,
we found that ψ_1(x, h_1) < −186 and |ψ_3(x, h_2) + 1| ≤ 1 for 0 ≤ x ≤ L.
Furthermore, for the curves of Fig. 10, a search of the electrodes showed
that, for 5-µm gaps between electrodes, max |δ*(x*, y*)| ≤ 0.29 V
and, for 1-µm gaps, max |δ*(x*, y*)| < 0.18 V. (The no-gaps curve was
calculated by the methods of I.) These correspond to maximum errors
of the channel potential of 1.3 and 0.83 percent respectively. For the
curve for the surface CCD with 1-µm gaps between the electrodes a
search of the electrodes showed that max |δ*(x*, y*)| ≤ 0.4 V. (The
no-gaps curve was again calculated by the methods of I.) This corre-
sponds to a maximum percentage error of 0.86 percent.

For the curves of the buried channel CCD of Fig. 10, we have
undoubtedly overestimated the error in the channel for the following
reasons. It can be shown that the error in each coefficient a_n and b_n
appearing in (34) can be expressed as an integral over the electrodes of
the error δ(x, y) times a weight function. The sign of δ(x, y) oscillates
on the electrodes, and so one would expect the error in the lower-order
coefficients to be quite small. Furthermore, an examination of (34) to
(38) and (40) shows that, for the parameters involved, only the first
ten terms in (40) contribute significantly to the channel potential.

To present some idea of the cost of running these programs, the
calculation of the solution for the case of 5-µm gaps in the curves of
Fig. 10 took 253 seconds and used 40 K of core; the case of 1-µm gaps
took 258 seconds and 40 K of core. Calculation of the solution for the
1-µm gap case of Fig. 11 took 263 seconds and used 40 K of core. By
comparison, the two corresponding no-gap solutions, obtained by the
methods of I, took 16 seconds and 32 seconds, respectively, and both
solutions required 40 K of core.


VI. ACKNOWLEDGMENTS 


The authors take pleasure in thanking G. E. Smith for originally 
suggesting this research and for many subsequent conversations. They 
also benefitted from many conversations on the subject of CCD’s with 
R. H. Krambeck, R. J. Strain, and R. H. Walden. 




APPENDIX A 


This appendix lists the coefficients of the Fourier series of the 
various splines used in approximating the charge density on the elec- 
trodes. We begin by defining the function 


\[
u(x, h) =
\begin{cases}
1 - x/h, & 0 \le x \le h, \\
0, & h \le x \le L.
\end{cases}
\tag{69}
\]

Outside the interval 0 ≤ x ≤ L, this function is defined by periodicity,
u(x, h) = u(x + L, h). The Fourier coefficients of u(x, h) are

\[
\begin{aligned}
a_n(h) &= c_n[u] = \frac{2}{L h \lambda_n^2}\,(1 - \cos \lambda_n h), \\
b_n(h) &= s_n[u] = \frac{2}{L h \lambda_n^2}\,(\lambda_n h - \sin \lambda_n h),
\end{aligned}
\tag{70}
\]

where the notation c_n[u] and s_n[u] is defined in (58), λ_n = 2nπ/L,
and n = 0, 1, 2, .... Note that


\[
a_0(h) = \frac{h}{L}, \qquad b_0(h) = 0.
\tag{71}
\]
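
As a check on these coefficients, the following Python sketch compares
closed forms for a_n(h) and b_n(h) with direct numerical integration. It
assumes that u(x, h) = 1 − x/h on 0 ≤ x ≤ h (zero on the rest of the
period) and that the notation of (58), which lies outside this appendix,
means c_n[f] = (2/L)∫ f(x) cos λ_n x dx and s_n[f] = (2/L)∫ f(x) sin λ_n x dx
over one period. Both assumptions are consistent with a_0(h) = h/L. The
function names and the Simpson-rule helper are illustrative.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    step = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(a + k * step)
    return total * step / 3.0

def a_n(n, h, L):
    """Cosine coefficient of the triangular function u(x, h)."""
    if n == 0:
        return h / L
    lam = 2.0 * math.pi * n / L
    return 2.0 * (1.0 - math.cos(lam * h)) / (L * h * lam ** 2)

def b_n(n, h, L):
    """Sine coefficient of the triangular function u(x, h)."""
    if n == 0:
        return 0.0
    lam = 2.0 * math.pi * n / L
    return 2.0 * (lam * h - math.sin(lam * h)) / (L * h * lam ** 2)

def numeric_coefficients(n, h, L):
    """(2/L) * integral of u*cos and u*sin over 0 <= x <= h, where u != 0."""
    lam = 2.0 * math.pi * n / L
    c = 2.0 / L * simpson(lambda x: (1.0 - x / h) * math.cos(lam * x), 0.0, h)
    s = 2.0 / L * simpson(lambda x: (1.0 - x / h) * math.sin(lam * x), 0.0, h)
    return c, s

for n in range(6):
    c_num, s_num = numeric_coefficients(n, 0.3, 1.0)
    print(n, a_n(n, 0.3, 1.0) - c_num, b_n(n, 0.3, 1.0) - s_num)
```

The printed differences should be at the level of the quadrature error.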
The triangular splines can all be expressed in terms of u(x, h), and
their Fourier coefficients are simple linear functions of the coefficients
a_n(h) and b_n(h) defined in eq. (70). Thus (see Fig. 4a):

\[
p_0(x, h) = u(x, h) + u(L - x, h),
\tag{72}
\]

and for n = 0, 1, 2, ...,

\[
c_n[p_0] = 2a_n(h), \qquad s_n[p_0] = 0.
\tag{73}
\]

Similarly (see Figs. 4b and 4c), we have the end splines

\[
p_\ell(x; x_0, h) = u(x - x_0, h), \qquad p_r(x; x_0, h) = u(x_0 - x, h),
\tag{74}
\]

and for n = 0, 1, 2, ...,

\[
\begin{aligned}
c_n[p_\ell] &= (\cos \lambda_n x_0)\, a_n(h) - (\sin \lambda_n x_0)\, b_n(h), \\
s_n[p_\ell] &= (\sin \lambda_n x_0)\, a_n(h) + (\cos \lambda_n x_0)\, b_n(h),
\end{aligned}
\tag{75}
\]

and

\[
\begin{aligned}
c_n[p_r] &= (\cos \lambda_n x_0)\, a_n(h) + (\sin \lambda_n x_0)\, b_n(h), \\
s_n[p_r] &= (\sin \lambda_n x_0)\, a_n(h) - (\cos \lambda_n x_0)\, b_n(h).
\end{aligned}
\tag{76}
\]
Finally (see Fig. 4f),

\[
t(x; x_0, h_1, h_2) =
\begin{cases}
u(x_0 - x, h_1) + u(x - x_0, h_2), & x \not\equiv x_0 \pmod{L}, \\
1, & x \equiv x_0 \pmod{L},
\end{cases}
\tag{77}
\]

and

\[
\begin{aligned}
c_n[t] &= \cos \lambda_n x_0\, \{a_n(h_1) + a_n(h_2)\} + \sin \lambda_n x_0\, \{b_n(h_1) - b_n(h_2)\}, \\
s_n[t] &= \sin \lambda_n x_0\, \{a_n(h_1) + a_n(h_2)\} - \cos \lambda_n x_0\, \{b_n(h_1) - b_n(h_2)\}.
\end{aligned}
\tag{78}
\]

The coefficients of the Fourier series of the singular edge splines were
calculated as follows. We define (see Figs. 4d and 4e)


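\[
s_\ell(x; x_0, h) =
\begin{cases}
(x - x_0)^{-1/2} - h^{-1/2}, & x_0 < x \le x_0 + h, \\
0, & 0 \le x \le x_0, \quad x_0 + h < x \le L,
\end{cases}
\tag{79}
\]

and

\[
s_r(x; x_0, h) = s_\ell(2x_0 - x; x_0, h).
\tag{80}
\]

Then after an integration by parts

\[
\begin{aligned}
c_n[s_\ell] &= \frac{4}{L} \left[ h^{1/2} \cos \lambda_n (x_0 + h)
  - \frac{\sin \lambda_n (x_0 + h) - \sin \lambda_n x_0}{2 \lambda_n h^{1/2}}
  + \lambda_n \int_{x_0}^{x_0 + h} (x - x_0)^{1/2} \sin \lambda_n x \, dx \right], \\
s_n[s_\ell] &= \frac{4}{L} \left[ h^{1/2} \sin \lambda_n (x_0 + h)
  + \frac{\cos \lambda_n (x_0 + h) - \cos \lambda_n x_0}{2 \lambda_n h^{1/2}}
  - \lambda_n \int_{x_0}^{x_0 + h} (x - x_0)^{1/2} \cos \lambda_n x \, dx \right].
\end{aligned}
\tag{81}
\]

The integrals on the right of (81) were evaluated by quadratures using
Filon’s method." Similarly,

\[
\begin{aligned}
c_n[s_r] &= \frac{4}{L} \left[ h^{1/2} \cos \lambda_n (x_0 - h)
  - \frac{\sin \lambda_n x_0 - \sin \lambda_n (x_0 - h)}{2 \lambda_n h^{1/2}}
  - \lambda_n \int_{x_0 - h}^{x_0} (x_0 - x)^{1/2} \sin \lambda_n x \, dx \right], \\
s_n[s_r] &= \frac{4}{L} \left[ h^{1/2} \sin \lambda_n (x_0 - h)
  + \frac{\cos \lambda_n x_0 - \cos \lambda_n (x_0 - h)}{2 \lambda_n h^{1/2}}
  + \lambda_n \int_{x_0 - h}^{x_0} (x_0 - x)^{1/2} \cos \lambda_n x \, dx \right].
\end{aligned}
\tag{82}
\]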
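
The integration by parts for the singular edge spline can be checked
numerically. The Python sketch below assumes that the spline is
(x − x_0)^(-1/2) − h^(-1/2) on x_0 < x ≤ x_0 + h and zero elsewhere, and that
c_n[f] = (2/L)∫ f(x) cos λ_n x dx over one period; both are readings of
definitions that are partly outside or garbled in this appendix. It does
not use Filon's method. Instead, the substitution x − x_0 = s² removes the
square-root singularity so that an ordinary Simpson rule suffices. All
function names are illustrative.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    step = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(a + k * step)
    return total * step / 3.0

def c_direct(n, x0, h, L):
    """Cosine coefficient of the left edge spline from its definition.
    With x - x0 = s^2 the integrand becomes 2*(1 - s/sqrt(h))*cos(...)."""
    lam = 2.0 * math.pi * n / L
    root_h = math.sqrt(h)
    f = lambda s: 2.0 * (1.0 - s / root_h) * math.cos(lam * (x0 + s * s))
    return 2.0 / L * simpson(f, 0.0, root_h)

def c_by_parts(n, x0, h, L):
    """Same coefficient from the integrated-by-parts form (n >= 1), whose
    remaining integrand (x - x0)^(1/2) * sin(lam * x) is bounded."""
    lam = 2.0 * math.pi * n / L
    root_h = math.sqrt(h)
    integral = simpson(lambda s: 2.0 * s * s * math.sin(lam * (x0 + s * s)),
                       0.0, root_h)
    return 4.0 / L * (root_h * math.cos(lam * (x0 + h))
                      - (math.sin(lam * (x0 + h)) - math.sin(lam * x0))
                      / (2.0 * lam * root_h)
                      + lam * integral)

for n in range(1, 5):
    print(n, c_direct(n, 0.2, 0.1, 1.0), c_by_parts(n, 0.2, 0.1, 1.0))
```

The two columns should agree to within the quadrature error, which
supports the signs and factors in the integrated-by-parts form.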


APPENDIX B


We assume that a solution of equations (21) to (25) satisfying
boundary and interface conditions (11) or (12) and (13) to (16), (18),
(26), and (27) exists and is bounded. As before, we denote this solution
by ψ(x, y), we denote our approximate solution by ψ^a(x, y), and we
define

\[
\xi(x, y) = \psi(x, y) - \psi^a(x, y).
\tag{83}
\]

By ψ_α, ψ_α^a, and ξ_α (0 ≤ α ≤ 4), we mean ψ, ψ^a, and ξ restricted to the
various subdomains.

From their construction, the approximate solutions satisfy the same
equations and boundary and interface conditions as the exact solution,
except they do not assume the correct values on the electrodes.
Consequently

\[
\nabla^2 \xi_\alpha(x, y) = 0 \quad (0 \le \alpha \le 3), \qquad
\nabla^2 \xi_4(x, y) - \xi_4(x, y) = 0.
\tag{84}
\]

Also ξ_0(x, y) satisfies either boundary condition (11) in the case of
single-level metalization or

\[
\xi_0(x, -h_{-1}) = 0
\tag{85}
\]

in the case of double-level metalization. In addition, ξ_0(x, 0) = ξ_1(x, 0),
0 ≤ x ≤ L, and ξ_α(x, y) satisfies (14) with ρ_s = 0, (15) with Q(x) = 0,
(16), (18), (26), and (27).

We now outline a proof that if M = sup |ξ_0(x, 0)|, where the supremum is
taken over the electrodes E, then |ξ(x, y)| ≤ M for all (x, y). The plan
of the proof is to show that ξ_0(x, y) is bounded both above and below by
its maximum and minimum on (0 ≤ x ≤ L, y = 0); that ξ_α(x, y) is bounded
above and below by its maximum and minimum on either (0 ≤ x ≤ L,
y = h_{α-1}) or (0 ≤ x ≤ L, y = h_α), (α = 1, 2, 3), where we define
h_0 = 0; and that ξ_4(x, y) is bounded above and below by its maximum
and minimum on (0 ≤ x ≤ L, y = h_3). Then we show that the global
maximum and minimum must occur on the electrodes.

First consider ξ_0(x, y) in the case of single-level metalization. Then
ξ_0(x, y) is harmonic in the strip S_0 = (0 ≤ x ≤ L, −∞ < y ≤ 0), and,
by the Phragmén-Lindelöf theorem [Ref. 15, corollary to theorem 19,
Chapter 2, with w(x, y) = 1 − y], ξ_0(x, y) is bounded in S_0, both above
and below, by its values on the lines (x = 0, −∞ < y ≤ 0), (0 ≤ x ≤ L,
y = 0), and (x = L, −∞ < y ≤ 0). Let m_0 = inf ξ_0(x, 0) and
M_0 = sup ξ_0(x, 0), both taken over 0 ≤ x ≤ L. [Note that since ξ_0(x, 0)
is continuous, there exist points 0 ≤ x_m, x_M ≤ L such that
m_0 = ξ_0(x_m, 0), M_0 = ξ_0(x_M, 0).]
Further, we must have

\[
\xi_0(x, y) = \tfrac{1}{2}\beta_0 + \sum_{n=1}^{\infty}
  (\beta_n \cos \lambda_n x + \gamma_n \sin \lambda_n x)\, e^{\lambda_n y},
\tag{86}
\]

and hence

\[
\lim_{y \to -\infty} \xi_0(0, y) = \lim_{y \to -\infty} \xi_0(L, y)
  = \tfrac{1}{2}\beta_0 = \frac{1}{L} \int_0^L \xi_0(x, 0)\, dx.
\tag{87}
\]

From (87) we can conclude that

\[
m_0 \le \lim_{y \to -\infty} \xi_0(0, y) = \lim_{y \to -\infty} \xi_0(L, y) \le M_0.
\tag{88}
\]

Now if M_0 is not the maximum value of ξ_0(x, y), from what we have
just shown, and from the periodicity of ξ_0(x, y) in x, this maximum
value must be assumed at two points (0, y_0) and (L, y_0), with
−∞ < y_0 < 0. Further, the outward directed normal derivative at
these points must be positive (Ref. 15, theorem 8, Chapter 2); that is,
−(∂ξ_0/∂x)(0, y_0) > 0, (∂ξ_0/∂x)(L, y_0) > 0. However, from periodicity,
(∂ξ_0/∂x)(0, y_0) = (∂ξ_0/∂x)(L, y_0), which is a contradiction. Hence M_0
is the maximum value of ξ_0(x, y) in S_0. The same reasoning applied to
−ξ_0(x, y) shows that m_0 is the minimum value in S_0.

In the case of double-level metalization, the maximum principle for
harmonic functions (Ref. 15, theorem 2, Chapter 2), plus the boundary
condition ξ_0(x, −h_{-1}) = 0, implies that ξ_0(x, y) is bounded everywhere
in (0 ≤ x ≤ L) × (−h_{-1} ≤ y ≤ 0), both above and below, by its
values on the sides (x = 0, −h_{-1} ≤ y ≤ 0), (0 ≤ x ≤ L, y = 0), and
(x = L, −h_{-1} ≤ y ≤ 0). Then the same reasoning as in single-level
metalization shows that ξ_0(x, y) achieves its maximum and minimum
on (0 ≤ x ≤ L, y = 0).

Essentially the same arguments used in the double-level metalization
case can be used to show that ξ_α(x, y) (α = 1, 2, 3) must achieve both
its maximum and minimum either on the line (0 ≤ x ≤ L, y = h_{α-1})
or (0 ≤ x ≤ L, y = h_α), where we define h_0 = 0.

The Phragmén-Lindelöf theorem can be applied to ξ_4(x, y) in
S_4 = (0 ≤ x ≤ L) × (h_3 ≤ y < ∞) to show that ξ_4(x, y) is bounded
everywhere in S_4 both above and below by its values on (x = 0,
h_3 ≤ y < ∞), (0 ≤ x ≤ L, y = h_3), and (x = L, h_3 ≤ y < ∞). Making
use of the boundary condition ξ_4(x, ∞) = 0, the same arguments used
in the case of single-level metalization for ξ_0(x, y) show that ξ_4(x, y)
achieves its maximum and minimum values on (0 ≤ x ≤ L, y = h_3).

We have shown that ξ(x, y) must assume its maximum and minimum
values at points on the lines (0 ≤ x ≤ L, y = h_α), 0 ≤ α ≤ 3. Suppose
that ξ(x, y) is a global maximum at the point P = (x, h_α). Then clearly
ξ_α(x, y) and ξ_{α+1}(x, y) take on their maximum values at P. Conse-
quently, their exterior normal derivatives at P must be positive (Ref.
15, theorem 8, Chapter 2), i.e., (∂ξ_α/∂y)(P) > 0, −(∂ξ_{α+1}/∂y)(P) > 0.
However, if P is not a point on an electrode, it follows from inter-
face conditions (14), (15) with Q(x) = 0, (16), or (26) that either
η(∂ξ_α/∂y)(P) = (∂ξ_{α+1}/∂y)(P) or (∂ξ_α/∂y)(P) = (∂ξ_{α+1}/∂y)(P),
which is a contradiction. Consequently ξ(x, y) achieves its maximum
M_0 on an electrode. The same argument applied to −ξ(x, y) shows
that ξ(x, y) also achieves its minimum m_0 on an electrode. If we set
M = max(|m_0|, |M_0|), then we have shown that |ξ(x, y)| ≤ M.


Error Rates of Digital Signals in
Charge Transfer Devices 


By K. K. THORNBER 
(Manuscript received June 21, 1973) 


We calculate the probability of error in detecting digital signals trans- 
ferred through a charge transfer device in the presence of incomplete charge 
transfer, random noise in the device, and detection uncertainty in the 
detector. The coefficient of incomplete charge transfer is assumed to be
independent of charge-packet size, and both the device noise and detector 
noise are assumed to be Gaussian. Error probabilities for two-level and 
four-level codes are computed for the cases of both simple static and 
optimum dynamic detection. For rms detection voltage level fluctuations 
Va of the order of tenths of volts (much larger than the random noise 
fluctuations in the device), a very rapid increase in error probability 
(from =10-” to %10-*) is found to occur for a very small (20 percent) 
change in Va. This indicates that detection level fluctuations will have 
to be held down to a few hundred millivolts at most. To achieve equal error 
rates with an error probability of about 10-“, Va for the detection of 
four-level codes will have to be about 3.5 times smaller than for two-level 
codes. Comparison of error probabilities under static and dynamic 
detection shows that in CTD’s improved detection has a greater potential 
for reducing error rates than improved coding.


I. INTRODUCTION 


As a packet of charge is transferred through a charge transfer 
device (CTD), the size of the packet is altered owing to effects of 
incomplete transfer!? and noise.*-® At the output the size of each 
packet is measured and, depending on its size, a decision is made as 
to the initial size of the packet. Usually the decision will be correct. 
However, occasionally the cumulative effects of incomplete transfer 
and noise will result in a sufficiently distorted charge packet that an 
error will be made. It is the purpose of this paper to calculate the 
probability of making such a detection error. When this probability is 
multiplied by the rate of detection (the clock frequency), we obtain 

the error rate for a single device. Multiplying by the number of devices
of interest in the storage unit or in the processing unit, we obtain the 
total error rate, a very useful quantity for evaluating digital systems. 
(By ‘‘detection,” we include regeneration; by a “single device,” we 
mean a single unregenerated line of transfer elements.) 

To calculate the probability of detection error we assume that the 
effect of incomplete transfer on the signal can be treated in terms of 
the usual small signal analysis.°" (The coefficient of incomplete 
charge transfer, a, is assumed to be constant, independent of the size 
of the signal.) Charge gain or loss because of leakage current is assumed 
to be sufficiently small that it can be ignored. The random noise which 
introduces fluctuations into the size of the charge packets is assumed 
to be Gaussian.* This is reasonable by the law of large numbers, since 
the size of a charge packet is typically 10° elementary charges. In the 
numerical calculations, only shot noise at the input and thermal noise 
induced during charge transfer are considered, as these are the most 
important sources of noise in good devices.® In addition, the detection 
levels are assumed to fluctuate with Gaussian statistics. This simulates 
(i) the fluctuation in detection levels from device to device, (ii) the
uncertainty in the location of the boundary between two decision
regions, (iii) the uncertainty introduced from nonideal regeneration,
and (iv) the fluctuations induced by the coupling of the clock lines to
the output. In a future paper, we plan to treat several of these effects 
more carefully. 

For our numerical work we take the position that probabilities of 
error of about 10-" are of greatest interest. Values much higher would 
necessitate more-often-than-daily correction of a multimegabit store. 
Attention is focused on how large a fluctuation can be tolerated in the 
detection levels, so that the probability of error is in this region for the 
cases of two-level and four-level digital codes. In addition, the error 
probability is also examined as a function of the number of charge 
transfers. Similar calculations are made for the theoretically minimum 
possible error rate, which can be obtained using a dynamic detection 
scheme.” Comparison of this absolutely minimum error probability 
with the error probability obtained using conventional (static) detec- 
tion suggests that a substantial improvement in error rate is possible 
using dynamic rather than static detection levels.” 


II. PROBABILITY OF ERROR 


In previous work,!-” it has been shown that in the absence of noise,
after (n + 1) transfers, each characterized by a coefficient of in-
complete transfer α, a charge packet of some initial size Q_i has at the
output a size Q(i) given by

\[
Q(i) = (1 - \alpha)^{n+1} Q_i + Q_b,
\tag{1}
\]

where

\[
Q_b = (1 - \alpha)^{n+1} \sum_{N=1}^{\infty} \binom{n+N}{N} \alpha^N Q_N,
\tag{2}
\]

and where Q_N is the initial size of the Nth packet preceding Q_i. In the
presence of noise, the probability P[Q − Q(i)]dQ that the observed
size Q of the packet is Q(i) to within dQ is given by

\[
P[Q - Q(i)]\, dQ = \frac{\exp\{-[Q - Q(i)]^2 / (2\,\Delta Q^2)\}}{(2\pi\, \Delta Q^2)^{1/2}}\, dQ,
\tag{3}
\]

where ΔQ² is the mean-square fluctuation in the size of the charge
packet at the output resulting from noise (see Appendix A). If the
range of Q over which the packet will be detected as Q_i is given by
Q_i^- < Q < Q_i^+, then P_i, the probability of error in detecting a specific
Q(i) packet, is

\[
P_i = \int_{-\infty}^{Q_i^-} P[Q - Q(i)]\, dQ + \int_{Q_i^+}^{\infty} P[Q - Q(i)]\, dQ.
\tag{4}
\]

To determine error probability, P_i must be averaged over all possible
Q(i) for each i.

The quantities Q_i^- and Q_i^+ can be readily determined by rewriting
(1) and (2) in the form

\[
Q(i) = \bar{Q} + (Q_i - \bar{Q})(1 - \alpha)^{n+1} + Q_B,
\tag{5}
\]

where

\[
Q_B = (1 - \alpha)^{n+1} \sum_{N=1}^{\infty} \binom{n+N}{N} \alpha^N (Q_N - \bar{Q}).
\tag{6}
\]

In (5) and (6), Q̄ is the (time) average size of a charge packet. (For
example, if two packet sizes, Q_1 and Q_0, are used equally frequently
in a two-level digital code, then Q̄ = (Q_1 + Q_0)/2.) If now we average
eq. (5) over all possible preceding sequences of packets, then we obtain
for ⟨Q(i)⟩, the average size of Q(i),

\[
\langle Q(i) \rangle = \bar{Q} + (Q_i - \bar{Q})(1 - \alpha)^{n+1},
\tag{7}
\]

since the average ⟨Q_B⟩ of Q_B is zero. (Note that ⟨Q_N⟩ = Q̄ by definition.)
The deviation of Q(i) from ⟨Q(i)⟩ is simply Q_B, independent of i.
[Q(i) − ⟨Q(i)⟩ = Q_B.] By extending the results of a previous treat-
ment” of two-level coding to the multilevel coding considered here, it
follows at once (see Appendix B) that the theoretically minimum
possible error rates will be achieved for Q_i^- and Q_i^+ given by

\[
Q_i^- = \frac{\langle Q(i) \rangle + \langle Q(i-1) \rangle}{2} + Q_B,
\tag{8}
\]

\[
Q_i^+ = \frac{\langle Q(i) \rangle + \langle Q(i+1) \rangle}{2} + Q_B.
\tag{9}
\]

(Note: Q_{i+1}^- = Q_i^+. For an M-level system, i = 0, 1, ..., M − 1.
For completeness we define Q(−1) = −∞ and Q(M) = +∞.) These
results apply even if the coding levels are not equally spaced. However,
it should be obvious that if each size packet is used equally frequently,
then equally spaced levels will result in the least probability of error.
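
The noiseless output size can be illustrated with a short Python sketch.
It assumes that the contribution of the Nth preceding packet carries the
weight (1 − α)^(n+1) C(n+N, N) α^N, where C denotes a binomial coefficient;
this reading is chosen because these weights, together with the factor
(1 − α)^(n+1) on the packet itself, sum to one, so that charge is conserved
and a long run of identical packets emerges unchanged. The function names
and numerical values are illustrative.

```python
import math

def background_weights(n, alpha, n_terms):
    """Weights (1-alpha)^(n+1) * C(n+N, N) * alpha^N for N = 1..n_terms,
    i.e. the assumed share of the Nth preceding packet in the output."""
    return [(1.0 - alpha) ** (n + 1) * math.comb(n + N, N) * alpha ** N
            for N in range(1, n_terms + 1)]

def output_charge(q_i, preceding, n, alpha):
    """Noiseless output size of a packet of initial size q_i after n + 1
    transfers; preceding[0] is the packet immediately ahead of it."""
    weights = background_weights(n, alpha, len(preceding))
    own = (1.0 - alpha) ** (n + 1) * q_i
    return own + sum(w * q for w, q in zip(weights, preceding))

n, alpha = 100, 1.0e-3   # illustrative values with n*alpha = 0.1

# A "one" following a long run of "zeros" loses about n*alpha of its charge.
isolated_one = output_charge(1.0, [0.0] * 40, n, alpha)

# A "one" inside a long run of "ones" is restored by the background charge.
steady_one = output_charge(1.0, [1.0] * 40, n, alpha)

print(isolated_one, steady_one)
```

The difference between the two cases is the sequence-dependent background
that dynamic detection levels are meant to track.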

In previous work” we have referred to the detection scheme which
utilizes detection levels determined by Q_B (that is, by the preceding
signal) as a dynamic detection scheme. In other words, by subtracting
out the incompletely transferred portion from the preceding signal
prior to each detection (achievable under noiseless conditions), we
can select detection regions which null out the scatter in the signal-
charge size induced by incomplete transfer. Since random noise cannot
be nulled out, a lower limit is placed on the error probability.

Using (8) and (9), we now compute the minimum error probability
P_min of a single detection and average this over all possible preceding
signals to obtain the minimum error probability ⟨P_min⟩. Let p_i be the
relative average frequency with which charge packets of initial size Q_i
are used in the code. Then using (4) we may write

\[
P_{\min} = \sum_{i=0}^{M-1} p_i P_i
  = \sum_{i=0}^{M-1} p_i \left( \int_{-\infty}^{Q_i^- - Q(i)} P(Q)\, dQ
  + \int_{Q_i^+ - Q(i)}^{\infty} P(Q)\, dQ \right).
\tag{10}
\]

If we note that [Q_i^- − Q(i)] = −[⟨Q(i)⟩ − ⟨Q(i − 1)⟩]/2 and that
[Q_i^+ − Q(i)] = +[⟨Q(i + 1)⟩ − ⟨Q(i)⟩]/2, then P_min becomes

\[
P_{\min} = \sum_{i=0}^{M-1} p_i \left(
  \int_{-\infty}^{-[\langle Q(i) \rangle - \langle Q(i-1) \rangle]/2} P(Q)\, dQ
  + \int_{+[\langle Q(i+1) \rangle - \langle Q(i) \rangle]/2}^{\infty} P(Q)\, dQ \right).
\tag{11}
\]

As mentioned in the preceding paragraph, P_min is independent of the
foregoing charge packets. Thus ⟨P_min⟩ = P_min. As ⟨P_min⟩ is the
minimal, or optimal, error probability, we will use it as a touchstone
to compare other detection schemes.

Complete dynamic detection as discussed above is one extreme in
detection. [Boonstra and Sangster* have operated a CTD utilizing
a partial lowest order (in nα) correction.] The other extreme in detec-
tion is to ignore completely the sequence of charge packets preceding
the packet of interest and to attempt to detect without compensating
for the accumulated background charge. Since the average of Q_B is
0, one would then choose for Q_i^- and Q_i^+ the following

\[
Q_i^- = [\langle Q(i) \rangle + \langle Q(i-1) \rangle]/2
\tag{12}
\]

and

\[
Q_i^+ = [\langle Q(i+1) \rangle + \langle Q(i) \rangle]/2.
\tag{13}
\]

In this case, the error probability P associated with a specific detection
event becomes

\[
P = \sum_{i=0}^{M-1} p_i \left(
  \int_{-\infty}^{-[\langle Q(i) \rangle - \langle Q(i-1) \rangle]/2 - Q_B} P(Q)\, dQ
  + \int_{+[\langle Q(i+1) \rangle - \langle Q(i) \rangle]/2 - Q_B}^{\infty} P(Q)\, dQ \right).
\tag{14}
\]

To calculate ⟨P⟩, the average error probability, we must average (14)
over all possible preceding signal sequences. Unlike P_min, P is a func-
tion of the preceding sequence through Q_B. In the remainder of this
paper we shall focus attention on calculating ⟨P⟩.


III. NUMERICAL METHOD 


Let us assume (i) that we are using a multilevel (M-level) code in
which each size of charge packet is used equally frequently (so that
p_i = 1/M), and (ii) that the levels of charge are equally spaced.
Let S^{1/2} = [Q(i + 1) − Q(i)]/2. Then from (14), it follows that

\[
\langle P \rangle = 2\left(1 - \frac{1}{M}\right)
  \left\langle \int_{-\infty}^{-[S^{1/2}(1 - \alpha)^{n+1} + Q_B]} P(Q)\, dQ \right\rangle.
\tag{15}
\]

[Note: If n and α are such that |Q_B| > S^{1/2} for some sequences of
charge packets, then errors are made with this detection scheme even
in the absence of noise. Thus using the detection scheme characterized
by the Q_i^- and Q_i^+ given by eqs. (12) and (13), it is essential that n
and α be such that |Q_B| < S^{1/2} for all possible sequences of packets.
Thus, (S^{1/2} + Q_B) > 0, and ⟨P⟩ < 1.]
For numerical calculations, it is expedient to use (3) to rewrite
(15) as

\[
\langle P \rangle = 2\left(1 - \frac{1}{M}\right)
  \left\langle \int_{-\infty}^{-(S/N)^{1/2}(1 - \alpha)^{n+1}(1 + z)}
  \frac{e^{-\eta^2/2}}{(2\pi)^{1/2}}\, d\eta \right\rangle,
\tag{16}
\]

where S/N, the signal-power-to-noise-power ratio, is given by

\[
S/N = \frac{\{[Q(i+1) - Q(i)]/2\}^2}{\Delta Q^2},
\tag{17}
\]

and where z is given by

\[
z = \sum_{N=1}^{\infty} \binom{n+N}{N} \alpha^N J_N.
\tag{18}
\]

In (18) each J_N [= (Q_N − Q̄)/S^{1/2}] is a random variable, which for M
even can take on the values ±1, ±3, ±5, ..., ±(M − 1) with equal
probability, and for M odd is 0, ±2, ..., ±(M − 1) again with equal
probability. To evaluate (16) in this form it is now necessary to average
the integral in (16) over all possible sequences J_1, J_2, J_3, ....

In this paper, we focus attention on that range of n and a which
will probably be of greatest device interest: na < 1. In this case,
only the first few terms in z will contribute significantly to its total
value. By "significantly," we mean, of course, that whether
$J_N = +(M-1)$ or $J_N = -(M-1)$ for fixed $J_1, \ldots, J_{N-1}$ and
$0 = J_{N+1} = J_{N+2} = \cdots$, makes an acceptably small (say, 0.1%)
difference in the values of the integral in (16). Thus, we can proceed
as follows. Evaluate the integral in (16) for $J_1$ equal to each of its
possible values and $0 = J_2 = J_3 = \cdots$, sum, divide by M, and
multiply by 2(1 - 1/M). This gives a first estimate of $\langle P \rangle$ which we
call $\langle P \rangle_1$. Now again evaluate the integral in (16) for all possible pairs
of $J_1, J_2$ with $0 = J_3 = J_4 = \cdots$, sum, divide by $M^2$, and multiply
by 2(1 - 1/M). This gives a second estimate of $\langle P \rangle$ which we call
$\langle P \rangle_2$. In general, $\langle P \rangle_2 > \langle P \rangle_1$. If $(\langle P \rangle_2 - \langle P \rangle_1)/\langle P \rangle_1$ is within the
desired accuracy, then we may stop here. If $\langle P \rangle_2$ differs significantly
from $\langle P \rangle_1$, we calculate $\langle P \rangle_3$ in the obvious way and compare to
$\langle P \rangle_2$, etc. For the numerical results presented in the next section,
(P)s is as far as it is necessary to calculate to obtain 0.1 percent 
accuracy. For na < 1, convergence is guaranteed. 
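
The successive-estimate procedure of this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' program: it assumes that the integral in (16) is a unit-Gaussian tail evaluated at $(S/N)^{1/2}(1-a)^{n+1}(1+z)$ and that the weight of $J_N$ in z is the binomial coefficient times $a^N$. The function names and the S/N values used in any call are hypothetical.

```python
from itertools import product
from math import comb, erfc, sqrt


def gaussian_tail(x):
    """Area of the unit Gaussian above x (equal to the area below -x)."""
    return 0.5 * erfc(x / sqrt(2.0))


def levels(M):
    """Equally likely values of J_N: +-1, +-3, ... (M even) or 0, +-2, ... (M odd)."""
    return list(range(-(M - 1), M, 2))


def estimate_P(M, n, a, snr, k):
    """k-th estimate: average over J_1..J_k with every later J_N set to 0."""
    # Assumed weight of the N-th preceding packet in z (hypothetical reading of eq. 18).
    weights = [comb(n + N, N) * a**N for N in range(1, k + 1)]
    atten = (1.0 - a) ** (n + 1)
    total = 0.0
    for seq in product(levels(M), repeat=k):
        z = sum(w * J for w, J in zip(weights, seq))
        total += gaussian_tail(sqrt(snr) * atten * (1.0 + z))
    # Sum, divide by M**k, and multiply by 2(1 - 1/M), as described in the text.
    return 2.0 * (1.0 - 1.0 / M) * total / M**k


def converge(M, n, a, snr, rel_tol=1e-3, k_max=6):
    """Increase k until two successive estimates agree to within rel_tol."""
    prev = estimate_P(M, n, a, snr, 1)
    for k in range(2, k_max + 1):
        cur = estimate_P(M, n, a, snr, k)
        if abs(cur - prev) <= rel_tol * prev:
            return k, cur
        prev = cur
    return k_max, prev
```

Calling `converge(2, 128, 1e-3, 36.0)` returns the number of terms kept and the corresponding estimate; the cost grows as $M^k$, which is why the scheme is practical only when na < 1 and few terms matter.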

Often knowledge of the error probability to within a factor of 2 is 
adequate for design purposes. Thus, computing can be greatly facili- 
tated if use is made of the following result. If A > 1, then 


$$ D/2 < I(A) < D, \tag{19} $$

where

$$ I(A) = \int_A^{\infty} e^{-t^2/2}\,dt \tag{20} $$

and

$$ D = \exp(-A^2/2)/A. \tag{21} $$

In the following section, our results for error probability are somewhat
high, as D has been used in place of I(A) in evaluating (16).
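
The bound (19) is easy to check numerically. The sketch below assumes the unnormalized definition of I(A) in (20), which can be written with the complementary error function as $\sqrt{\pi/2}\,\mathrm{erfc}(A/\sqrt{2})$; the function names are illustrative only.

```python
from math import erfc, exp, pi, sqrt


def I(A):
    """Integral of exp(-t^2/2) from A to infinity, via the complementary error function."""
    return sqrt(pi / 2.0) * erfc(A / sqrt(2.0))


def D(A):
    """Closed-form upper bound exp(-A^2/2)/A."""
    return exp(-A * A / 2.0) / A


def bound_holds(A):
    """True when D/2 < I(A) < D."""
    return D(A) / 2.0 < I(A) < D(A)
```

For large A the ratio I(A)/D approaches 1, so replacing I(A) by D overestimates the error probability only slightly in the low-error region.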


IV. NUMERICAL RESULTS


We have calculated both the minimum error probability $\langle P_{\min} \rangle$
[eq. (11)] using a dynamic detection scheme [eqs. (8) and (9)] and
the error probability $\langle P \rangle$ [eq. (15)] using a static detection scheme
[eqs. (12) and (13)], both for two-level and four-level coding. In all
cases, we have taken $a = 10^{-3}$, storage capacitance C = 0.1 pF,
detection capacitance $C_{DL}$ = 0.1 pF (see Appendix A), $Q_0 = C \cdot (4$ volts$)$,
and $Q_{M-1} = C \cdot (10$ volts$)$. All calculations were carried to
0.1 percent.

[Figure 1 plot: log10 (error probability) versus rms detection fluctuation in volts (0.1 to 0.5); static and dynamic detection curves for 64-bit four-level and 64-bit two-level coding.]

Fig. 1—Error probability as a function of the root-mean-square fluctuation in the
detection level voltage for static and dynamic detection schemes of two-level and four-
level coding in a 64-bit device.

In Fig. 1 we have plotted error probability (P) (static detection) 
and minimum error probability (Pmin) (dynamic detection) for two- 
level and four-level coding as a function of the root-mean-square 
detection-voltage fluctuation $V_d$. For the two-level results n = 128,
and for the four-level results n = 64. Both these cases correspond to a 
64-bit device. It is quite clear from Fig. 1 that to achieve an error 
probability of about 10~“, for two-level coding Va < 0.3845 V, whereas 
for four-level coding Vz < 0.105 V. This means that, to be able to use 
four-level coding, we must have significantly better control of detection 
voltage fluctuation than is necessary with two-level coding. 

We might imagine that a trade-off could exist which would favor 
four-level coding. For example, only one-half the number of transfer 
stages are needed with four levels as compared with two levels. Taking 
a inversely proportional! to C, for four levels we can double C and 
thereby cut a in half relative to C and a for two levels. As a is reduced, 
the role of incomplete transfer is reduced as well. However, for
$V_d$ = 0.35 V, detection noise dominates the random noise. Thus S/N is
practically unchanged as C is varied [see eq. (24) in Appendix A].
In addition, S/N for four levels is so small (8) that (P) goes only 
from 6.8-10-? for a = 10-* to 4.3-10- for a = 0.5-10-%. Of course, for 
smaller $V_d$ the change would be more drastic, as S/N would be larger.
However, for smaller $V_d$, two-level operation is enhanced as well.

In Fig. 2 we have plotted the error probability of a two-level code 
as a function of the number of transfers for three different detection- 
level fluctuations for both static and dynamic detection schemes. In 
Fig. 3 we have plotted the same quantities for four-level coding and
lower detection-level fluctuations. The striking superiority of dynamic 
detection over static detection is evident. (The dynamic curves are 
not actually flat; they increase somewhat in the region shown and 
much more rapidly for na > 1.) 


V. CONCLUSIONS 


In this paper we have derived expressions for the probability of 
error in detecting the size of charge packets carrying digital informa- 
tion in charge transfer devices. Effects of both random noise in the 
transfer device and detection noise at the detector were included. 
Error probabilities were computed and compared for common, static
detection and optimum, dynamic detection of two types of coding
schemes. In the region of primary interest here (detection noise much
larger than device noise), it was found that the error probability is a
very sensitive function of detection noise, varying 20 orders of
magnitude for a ±20 percent change in the detection noise level. Also
significant was the finding that, to achieve equivalent operational
performance, the rms detection noise level in a device using a four-level
code must be 3.5 times smaller than that in a device using a two-level
code. Thus, in designing circuits for digital signal detection, it will be
necessary to focus primary attention on the detection level noise. This
must be held to a few hundred millivolts at most. It was also shown
how our dynamic detection scheme could maintain a very low error
probability as the number of transfers, n, was greatly increased.

Fig. 2—Error probability as a function of the number of transfers n for static
and dynamic detection of two-level coding for three values of root-mean-square
detection voltage fluctuation (0.30, 0.35, and 0.40 V). For given n, the corresponding
device is an n/2-bit device.

1804 THE BELL SYSTEM TECHNICAL JOURNAL, DECEMBER 1973

[Figure 3 plot: log10 (error probability), from -4 to -20, versus number of transfers n, from 0 to 120; four-level coding, static and dynamic detection curves.]

Fig. 3—Error probability as a function of the number of transfers n for static
and dynamic detection of four-level coding for three values of root-mean-square
detection voltage fluctuation (0.10, 0.15, and 0.20 V). For given n, the corresponding
device is an n-bit device.

APPENDIX A
Noise


In general, we can write the mean-square noise charge, $\Delta Q_r^2$, resulting
from random fluctuations in the form’

$$
\Delta Q_r^2 = \Delta Q_{\mathrm{input}}^2 (1-a)^{2(n+1)} + \Delta Q_{SP}^2 H_{SP}(n+1)
+ 2\,\Delta Q_{TP}^2 H_{TP}(n+1), \tag{22}
$$

where $\Delta Q_{\mathrm{input}}^2$ is the input noise contribution, $\Delta Q_{SP}^2$ is the storage
process noise acquired by a single packet during a single clock period,
$\Delta Q_{TP}^2$ is the transfer process noise acquired by a single packet during
a single charge transfer, $(1-a)^{2(n+1)}$ is the (square of the) attenuation
from input to output, $H_{SP}(n)$ is the compounding factor for storage
process noise, and $H_{TP}(n)$ is the compounding factor for transfer
process noise. A derivation of eq. (22) and a discussion of the various
terms therein are treated elsewhere.*-* For our purposes (na < 1), it
suffices to let $H_{SP}(n+1) = H_{TP}(n+1) = n+1$. (For na < 1,
incomplete transfer of the noise can be ignored relative to the noise
itself. Thus after (n + 1) transfers, the accumulated noise is just
(n + 1) times the noise resulting from a single transfer.) We shall
assume that $\Delta Q_{SP}^2 \ll \Delta Q_{TP}^2$ and set $\Delta Q_{SP}^2 = 0$. For shot noise at the
input, $\Delta Q_{\mathrm{input}}^2 = qQ$, where Q is the mean total signal charge
$(Q_{M-1} - Q_0)$. For thermal noise, $\Delta Q_{TP}^2$ = 3kTC. As it turns out, the
exact details of $\Delta Q_r^2$ are not essential because these random effects
turn out to be much smaller than the detection level fluctuations
discussed below. However, if these detection level fluctuations can be
reduced, then eq. (22) is quite important, especially in the region of
na = 1 where devices can operate using dynamic detection.

There are two equivalent ways in which detection level fluctuations
can be included. The more systematic way is to use $\Delta Q_r^2$ in place of
$\Delta Q^2$ in eqs. (3) and (4) and then average over $Q_i^-$ and $Q_i^+$ in (4) with
the appropriate distribution for the detection level fluctuations. In
this paper, we have restricted attention to a Gaussian distribution for
these fluctuations. As the noise is also Gaussian, it follows from a
straightforward integration that we can write eq. (4) in the form
given in the text with

$$ \Delta Q^2 = \Delta Q_r^2 + \Delta Q_d^2, \tag{23} $$

where $\Delta Q_d^2$ is the mean-square uncertainty of the detection level.
[The second way is just to write (23) a priori.] Since some detection
error will result from nonideal regeneration, once this can be more
accurately simulated, a more careful analysis of detection uncertainty
will be necessary.

The uncertainty $\Delta Q_d^2$ is generated by an uncertainty $V_d^2$ in the
detection voltage. Thus,

$$ \Delta Q_d^2 = C_{DL}^2 V_d^2, \tag{24} $$

where $C_{DL}$ is the capacitance of the detector. In our calculations, we
have assumed that $C_{DL} = C$, where C is the elemental storage
capacitance. If now $\Delta Q_d^2 \gg \Delta Q_r^2$, then $S/N \approx V^2 C^2 / V_d^2 C^2 = V^2/V_d^2$,
independent of the capacitance. (Here V represents the signal voltage.)
Thus, increasing C does not improve the signal-to-noise ratio (S/N)
when detection noise exceeds random noise.
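
The capacitance argument can be illustrated numerically. The sketch below assumes equally spaced levels between the 4-volt and 10-volt packets quoted in Section IV, takes the signal voltage V to be half the level spacing, and treats the random-noise term as a free parameter; the function names and the random-noise value in the usage note are hypothetical.

```python
def half_spacing_volts(v_low, v_high, M):
    """Half the spacing between adjacent levels, in volts, for M equally spaced levels."""
    return (v_high - v_low) / (2.0 * (M - 1))


def snr(C, v_half, dq_r2, v_d, c_dl=None):
    """S/N = (C * V)^2 / (dQ_r^2 + C_DL^2 * V_d^2), with C_DL = C unless given."""
    if c_dl is None:
        c_dl = C
    return (C * v_half) ** 2 / (dq_r2 + (c_dl * v_d) ** 2)


def snr_detection_limited(v_half, v_d):
    """Limit of snr() when detection noise dominates: (V / V_d)^2."""
    return (v_half / v_d) ** 2
```

With four levels and an rms detection fluctuation of 0.35 V, `snr_detection_limited(half_spacing_volts(4.0, 10.0, 4), 0.35)` gives about 8, in line with the value quoted in Section IV, and doubling C in `snr(...)` with a small random-noise term such as `1e-31` (an arbitrary illustrative figure) changes the result by far less than one percent.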


APPENDIX B 
Minimum Error Probability 


It is a very general result, rederived in a previous work,” that if
I(A) is defined by

$$ I(A) = \int_A^{\infty} e^{-t^2/2}\,dt \tag{25} $$

and if the probability that A < 0 is 0, then

$$ \langle I(A) \rangle \ge I(\langle A \rangle). \tag{26} $$

Thus any detection scheme for which

$$
\left\langle \int_{-\infty}^{Q_i^- - Q(i)} P(Q)\,dQ \right\rangle
= \int_{-\infty}^{\langle Q_i^- - Q(i) \rangle} P(Q)\,dQ \tag{27}
$$

for each i (and the corresponding equalities for $\int_{Q_i^+ - Q(i)}^{\infty} \cdots\, dQ$) will result
in the minimum overall error probability. The choice given in eqs. (8)
and (9) does this, as it makes $Q_i^- - Q(i)$ independent of the preceding
sequence of charge packets over which the average in (26) and (27)
is taken.


APPENDIX C 
Realizability of Dynamic Detection Scheme 


Before considering the realizability of the scheme of dynamic 
detection discussed in the text, a point of clarification is necessary. 
One reason for developing dynamic detection here is to see just how 
low we can, in principle, make the error rate. This we have done 
assuming Gaussian noise, linear incomplete transfer, and complete 
knowledge of the preceding signal. If we relax the last assumption, 
we must take into account the fact that our detection of the preceding 
signal may not be perfect and, therefore, a higher error rate may in 
fact be the minimal rate possible physically. This problem is more 
difficult and will not be attempted here. What is important to
distinguish, however, is the difference between "perfect" dynamic
detection, which provides a minimum error rate below which one
cannot hope to go, and the actual error rate when employing
dynamic detection, which as we shall indicate below is not appreciably
larger than the minimum rate under operating conditions of interest.
With this in mind, let us proceed to a consideration of realizability. 

In the absence of noise, the dynamic detection scheme is clearly
realizable in principle. Knowledge of the preceding signal permits
determining the background charge level $Q_B$ (resulting from
incomplete transfer) operationally using eq. (2). This permits placing
the detection levels so that the size of the charge packet to be detected
will lie midway between these detection levels. Under noiseless
conditions, this permits error-free detection which, in turn, provides the
signal history needed to determine $Q_B$ for the next packet detection.

In the presence of noise, one may ask whether the dynamic detection 
scheme envisioned in Section II is truly realizable. If, for example, an 
error is made in detection, then the detection levels may be shifted 
far enough away from optimum so that for the next packet the error 
probability will be greatly increased. Fortunately, as the argument 
below suggests, if the probability of making a second error immedi- 
ately following the first is small compared to unity, then the optimum 
(minimum) error probabilities presented in the text are only slightly 
increased (on the order of percents rather than order of magnitude). 

Consider a two-level code and the dynamic detection scheme in
which it is only necessary to adjust the detection levels for the first
preceding signal [$Q_B = a(n+1)Q_1$, $Q_1$ = size of first preceding
packet]. We desire the probability $P_c(i)$ that the ith packet is detected
correctly. Clearly,

$$ P_c(i+1) = P_{cc}(i+1|i)\,P_c(i) + P_{ce}(i+1|i)\,P_e(i), \tag{28} $$

where $P_e(i)$ [$= 1 - P_c(i)$] is the probability that the ith packet is
detected incorrectly, $P_{cc}(i+1|i)$ is the probability that the (i + 1)th
packet is correctly detected given that the ith was also, and $P_{ce}(i+1|i)$
is the probability that the (i + 1)th packet is correctly detected given
that the ith was detected incorrectly. Noting that $P_c(i+1) = P_c(i) = P_c$,
we can solve (28) for $P_c$, obtaining

$$ P_c = [1 + P_{ec}(i+1|i)/P_{ce}(i+1|i)]^{-1}, \tag{29} $$

where $P_{ec} = 1 - P_{cc}$. The error probability $P_e$ ($= 1 - P_c$) which we
seek is, therefore, given by

$$ P_e = [1 + P_{ce}(i+1|i)/P_{ec}(i+1|i)]^{-1} \tag{30} $$

$$ \approx \frac{P_{ec}}{P_{ce}}. \tag{31} $$


Equation (31) follows if $P_{ec} \ll P_{ce}$, as will be the case for $P_e \ll 1$,
which is the region of greatest interest. If now $P_{ee}$ ($= 1 - P_{ce}$) < 0.1, then $P_e$ will
differ from $P_{ec}$ [calculated in the text, eq. (11), as $\langle P_{\min} \rangle$] by less
than 10 percent, an insignificant change. Although we have not
investigated $P_{ee}$ in detail, it is clear that $P_{ee}$ will be closer in size to
$\langle P \rangle$ [eq. (15)] corresponding to static detection rather than to $\langle P_{\min} \rangle$.
However, what is important is that $P_{ee}$ can be as large as one-half
without increasing $\langle P_{\min} \rangle$ by more than a factor of 2. Thus, the $\langle P_{\min} \rangle$
calculated here are not expected to be overly optimistic so long as the 
detection level need only be corrected on the basis of just the first 
preceding signal. For the present, this is the situation of primary 
interest. It should be kept in mind, however, that it is the random 
noise and not the incomplete transfer which complicates dynamic 
detection. With sufficiently low noise, we can in principle greatly 
reduce incomplete-transfer distortion without appreciable propagation 
of detection errors even when the detection level must be corrected 
on the basis of many preceding signals. 
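
The two-state argument of eqs. (28) to (30) can be checked with a short Python sketch. It treats the conditional probabilities as constants independent of i, which is the stationarity assumption made in the text; the function names and the numerical values in the usage note are illustrative only.

```python
def steady_state_error(p_ec, p_ee):
    """Stationary error probability from eq. (30): P_e = 1 / (1 + P_ce / P_ec)."""
    p_ce = 1.0 - p_ee  # probability of a correct detection after an error
    return 1.0 / (1.0 + p_ce / p_ec)


def iterate_error(p_ec, p_ee, steps=200, p_c0=1.0):
    """Apply the recursion of eq. (28) repeatedly and return the resulting P_e."""
    p_cc = 1.0 - p_ec  # probability of a correct detection after a correct one
    p_ce = 1.0 - p_ee
    p_c = p_c0
    for _ in range(steps):
        p_c = p_cc * p_c + p_ce * (1.0 - p_c)
    return 1.0 - p_c
```

For example, with `p_ec = 1e-6` the ratio `steady_state_error(p_ec, 0.5) / p_ec` stays just below 2, which is the factor-of-2 statement made above, and `iterate_error` converges to the same value because the recursion contracts by the factor $P_{ee} - P_{ec}$ at each step.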
Stability of a General Type of Pulse-
Width-Modulated Feedback System


By R. WALK and J. ROOTENBERG 
(Manuscript received May 31, 1973) 


Because of its theoretical and practical interest, the stability problem 
in pulse-width-modulated feedback systems has received an enormous 
amount of attention. Much of the reported literature deals with highly 
approximate methods, and the exact approaches, based on Lyapunov’s 
direct method or functional analysis, are quite restrictive and do not 
easily lend themselves to systematic compensation or design. 

In this paper, a quite general PWM is considered, and a frequency
domain stability criterion is presented, yielding a geometric interpretation
in the Popov plane.


I. INTRODUCTION 


The stability of pulse-width-modulated control systems has been 
an active area of research since the early 1960’s. A variety of graphical 
and analytical approaches to the problem have appeared in the 
literature.t-* Aside from the approximate methods, the main contri- 
bution of the early 1960’s to exact stability criteria was in the applica- 
tion of Lyapunov’s direct method.*'* As is often the case, this approach 
yields conservative results and does not easily lend itself to system 
compensation. Input-output stability via functional analytic tech- 
niques was reported in Skoog’? and Skoog and Blankenship,? where 
conditions for the LZ; boundedness and continuity of the system opera- 
tor are derived for PWM systems (considered there to belong to a 
larger class of pulse-modulated systems, i.e., that class of modulators 
for which the input is sampled). One drawback to the above type of 
criteria is the lack of a simple geometric interpretation; e.g., a
Popov-type condition. In Skoog’ a circle criterion is derived for PWM systems
operating in the "quasi-linear" mode; that is, where the modulator does
not saturate. In its exact form, however, the above condition is rather
difficult to apply (the radius of the circle is in the form of an infinite
sum, involving an arbitrary parameter).

In all the previous cases, the pulse-width modulator considered is 
the periodic sampling type, where the input to the modulator is 
sampled and the polarity and width of the output pulse is determined 
from that sample. This paper will consider a similar PWM which is a 
generalization (GPWM) of the so-called natural sampling type.°-"-* 
In this scheme, the input is compared to a repetitive reference wave- 
form, and pulses are emitted in accordance with some specified relation 
between the two signals. 

It is the purpose of this paper to develop a geometric stability 
criterion for the GPWM system. The main result of the paper is a 
frequency domain condition for the stability of a feedback system 
containing a GPWM (described below) and a linear plant that may 
be either lumped or distributed. The condition, similar to a Popov 
type, is interesting in that it allows a tradeoff between the slope of the 
stability line and its intersection with the real axis of the Popov plane. 


II. NOTATION 

In this paper we are concerned with measurable functions of a real 
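variable defined on $[0, \infty)$. We consider the function spaces $L_p$ ($p \ge 1$),
where

$$ L_p\,(p \in [1, \infty)) = \left\{ x(t): \int_0^{\infty} |x(t)|^p\,dt < \infty \right\} $$

and

$$ L_{\infty} = \left\{ x(t): \operatorname*{ess\,sup}_{t \ge 0} |x(t)| < \infty \right\}. $$

The corresponding norms are defined by

$$ \|x(t)\|_{L_p}\,(p \in [1, \infty)) = \left( \int_0^{\infty} |x(t)|^p\,dt \right)^{1/p} $$

and

$$ \|x(t)\|_{L_{\infty}} = \operatorname*{ess\,sup}_{t \ge 0} |x(t)|. $$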


Also, we shall use the extensions!’ of these spaces, defined as: 


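$$ L_{pe}\,(p \in [1, \infty)) = \left\{ x(t): \int_0^{T} |x(t)|^p\,dt < \infty,\ \forall T \in [0, \infty) \right\} $$

and

$$ L_{\infty e} = \left\{ x(t): \operatorname*{ess\,sup}_{t \ge 0} |x_T(t)| < \infty,\ \forall T \in [0, \infty) \right\}, $$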


*In a very recent paper, V. M. Kuntsevich! has treated this type of modulator 
by the discrete version of Lyapunov’s direct method. 


STABILITY OF PWM FEEDBACK SYSTEM 1813 


where

$$ x_T(t) = \begin{cases} x(t), & t \le T \\ 0, & t > T. \end{cases} $$


And finally stability will be interpreted to mean that, for all inputs 
belonging to the spaces of interest, the composite system operator is a 
bounded mapping of those spaces into themselves. 


III. SYSTEM DESCRIPTION AND ASSUMPTIONS 


Consider the feedback system of Fig. 1, where the output of the 
GPWM is: 


$$ m(t) = \sum_{K=0}^{\infty} M e_K \left[ u(t - KT_a) - u(t - KT_a - \tau_K) \right]. \tag{1} $$

The constant M is the pulse height, u(t) is the unit step function,
and $T_a$ is the period of the modulator. Also $e_K \triangleq \operatorname{sgn}[\sigma(KT_a)]$.
Furthermore, if we define:

$$ w_K(t) \triangleq [\sigma(t) - e_K A (t - KT_a)]\,[u(t - KT_a) - u(t - (K+1)T_a)], \tag{2} $$

where A (the slope) is a positive constant, then

$$
\tau_K = \begin{cases}
\min\{(t - KT_a): w_K(t) = 0,\ t \in [KT_a, (K+1)T_a)\} \\
T_a, \quad \text{if } w_K(t) \ne 0,\ \forall t \in [KT_a, (K+1)T_a).
\end{cases} \tag{3}
$$


The above relations are illustrated in Fig. 2. 

From eqs. (1) and (3) we see that the GPWM is a causal operator 
mapping L,, into itself. Furthermore, it is interesting to note that the 
periodic PWM is derivable from the GPWM by inserting a sampler
(operating every $T_a$ seconds) and a zero order hold before the
modulator, as shown in Fig. 3. Here the analog of eq. (3) would be

$$
\tau_K = \begin{cases}
T_a, & |\sigma(KT_a)| > AT_a \\
\dfrac{1}{A}\,|\sigma(KT_a)|, & |\sigma(KT_a)| \le AT_a
\end{cases}
$$

which is, indeed, the functional relation between the pulse width and
the sampled input of a periodic PWM.

[Figure 1 block diagram: GPWM and linear plant G(s) connected in a feedback loop.]

Fig. 1—GPWM feedback system.

[Figure 2 sketch: modulator waveforms against time, with the slope of the reference ramp marked.]

Fig. 2—Modulator definitions.
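
The pulse-width rules can be made concrete with a small Python sketch. It assumes the reading of eqs. (2) and (3) in which the pulse ends at the first instant where the input meets a reference ramp of slope A restarted at each period; the crossing is located by a grid scan followed by bisection, which is a numerical convenience and not part of the paper. All function names are hypothetical.

```python
def sgn(x):
    """Sign function returning -1, 0, or +1."""
    return (x > 0) - (x < 0)


def gpwm_pulse_width(sigma, K, Ta, A, steps=1000):
    """Width of the K-th pulse: first zero of w_K(t) in the period, else Ta."""
    t0 = K * Ta
    e_K = sgn(sigma(t0))

    def w(t):
        # Input minus the reference ramp of slope A, signed by e_K.
        return sigma(t) - e_K * A * (t - t0)

    prev_t, prev_w = t0, w(t0)
    if prev_w == 0.0:
        return 0.0
    for j in range(1, steps + 1):
        t = t0 + j * Ta / steps
        cur_w = w(t)
        if cur_w == 0.0 or (cur_w > 0) != (prev_w > 0):
            lo, hi = prev_t, t
            for _ in range(60):  # bisection on the bracketing interval
                mid = 0.5 * (lo + hi)
                if (w(mid) > 0) == (prev_w > 0):
                    lo = mid
                else:
                    hi = mid
            return min(hi - t0, Ta)
        prev_t, prev_w = t, cur_w
    return Ta  # no crossing: the modulator saturates


def periodic_pulse_width(sigma_K, Ta, A):
    """Pulse width of the periodic PWM from the sampled input sigma(K*Ta)."""
    return Ta if abs(sigma_K) > A * Ta else abs(sigma_K) / A
```

For a constant input the two modulators agree, for example `gpwm_pulse_width(lambda t: 0.5, 0, 1.0, 1.0)` and `periodic_pulse_width(0.5, 1.0, 1.0)` both give a width of 0.5; they differ once the input varies within a period, which is the case the GPWM analysis is meant to cover.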

It is worth pointing out that various forms of the GPWM process
could exist. The modulator may be one-sided ($e_K = +1$, $\forall K$) as,
for example, in dc power conditioning; it may emit multiple pulses
per period; or the reference ramp may be replaced by a symmetrical
triangle or other similar waveforms. The results of this paper may be
extended to any of these variations.

Fig. 3—Derivation of PPWM.

With the foregoing, the following assumptions are also made (see
Fig. 1):

A1. U(t) is absolutely continuous on $[0, \infty)$, and $U(t), \dot{U}(t) \in L_1 \cap L_2$,
where U(t) includes an external input and the
zero input response of the linear plant.

A2. $g(t), \dot{g}(t) \in L_1 \cap L_2$; g(t) = 0, t < 0, where g(t) is the impulse
response of the linear subsystem.


Normally, in input-output stability analysis the solution of the 
system is assumed to exist in the extended space under consideration. 
However, the constraints of the modulator make this unnecessary. 


Lemma 1: Under assumptions A1 and A2, $\sigma(t) \in L_{pe}$ (p = 1, 2).

Proof: From eq. (1), for any finite time $T \in [0, \infty)$ the modulator
will produce a finite number of pulses. Thus $m(t) \in L_{pe}$ ($p \ge 1$),
which implies by virtue of A2 that so does c(t). Hence $\sigma(t) \in L_{pe}$
(p = 1, 2) by A1 and the linearity of the $L_p$ spaces.


IV. STABILITY 


The objective of this section is to develop a geometric stability 
criterion for GPWM systems. Conditions for the system response to 
belong to $L_p$ ($p \ge 1$) will be derived, yielding a geometric criterion in
the Popov plane. The result will require that the linear subsystem 
have a measurable impulse response, satisfying A2, and thus may 
represent either a lumped or distributed plant. The following extension 
of a result due to Euler’ will be useful in establishing the criterion. 


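Lemma 2: If x(t) is absolutely continuous on [0, T] for any $T \in [0, \infty)$,
then:

$$
\sum_{K=0}^{N} |x(KT_a)| \le \frac{1}{T_a} \int_0^T |x(t)|\,dt
+ \frac{1}{2} \int_0^T |\dot{x}(t)|\,dt + \frac{1}{2}\left[ |x(0)| + |x(NT_a)| \right], \tag{4}
$$

where $N = [T/T_a]$; i.e., the largest integer $\le T/T_a$, and the derivative
$\dot{x}(t)$ exists almost everywhere.

Proof: For K = 0, 1, 2, ..., N - 1,

$$
\int_{KT_a}^{(K+1)T_a} \left( \frac{t}{T_a} - K - \frac{1}{2} \right) \frac{d}{dt}|x(t)|\,dt
= \frac{1}{2}\left[ |x(KT_a)| + |x((K+1)T_a)| \right]
- \frac{1}{T_a} \int_{KT_a}^{(K+1)T_a} |x(t)|\,dt
$$

since both x(t) and $(t/T_a - K - \tfrac{1}{2})$ are absolutely continuous on the
interval.1® In the integrand on the left, we can replace K by $[t/T_a]$
since, on the interval, $t/T_a - [t/T_a] - \tfrac{1}{2}$ differs from $t/T_a - K - \tfrac{1}{2}$
only on a set of measure zero. Now addition of the above for K = 0, 1,
2, ..., N - 1 and then the expression $\tfrac{1}{2}[|x(0)| + |x(NT_a)|]$ to both
sides yields:

$$
\sum_{K=0}^{N} |x(KT_a)| = \frac{1}{T_a} \int_0^{NT_a} |x(t)|\,dt
+ \int_0^{NT_a} \left( \frac{t}{T_a} - \left[ \frac{t}{T_a} \right] - \frac{1}{2} \right) \frac{d}{dt}|x(t)|\,dt
+ \frac{1}{2}\left[ |x(0)| + |x(NT_a)| \right].
$$

Noting that:

$$
\left| \int_0^{NT_a} \left( \frac{t}{T_a} - \left[ \frac{t}{T_a} \right] - \frac{1}{2} \right) \frac{d}{dt}|x(t)|\,dt \right|
\le \frac{1}{2} \int_0^{NT_a} |\dot{x}(t)|\,dt
$$

and that $NT_a \le T$, we see:

$$
\sum_{K=0}^{N} |x(KT_a)| \le \frac{1}{T_a} \int_0^T |x(t)|\,dt + \frac{1}{2} \int_0^T |\dot{x}(t)|\,dt
+ \frac{1}{2}\left[ |x(0)| + |x(NT_a)| \right].
$$

Q.E.D.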

Along with the above result, the following observation concerning 
the modulator will be of interest in what follows. 


Lemma 3: Consider Fig. 1. If $m(t) \in L_p$ for any $p \in [1, \infty)$, then it
belongs to $L_p$ for all $p \in [1, \infty]$.

Proof: Suppose $m(t) \in L_p$ for some $p \in [1, \infty)$. Then

$$ \int_0^{\infty} |m(t)|^p\,dt = M^p \sum_{K=0}^{\infty} \tau_K < \infty $$

and, since M is a finite number, we see that, for any r,

$$ \|m(t)\|_{L_r}^r = M^r \sum_{K=0}^{\infty} \tau_K < \infty $$

and thus $m(t) \in L_p$ for all $p \in [1, \infty]$ [of course, $m(t) \in L_{\infty}$ by
virtue of (1)].

With the foregoing we are now in a position to state the main 
result of this paper. 


Theorem: Consider the GPWM feedback system of Fig. 1. Suppose there 
exist two numbers, qi € Rt, go ¥ 0, such that: 


@ = > plla@lles + MGI, +190) — 7 and 


(i) Re CL + jugs)G@(jo)] = 7 vo © Rt, 
where

$$ G(j\omega) = \int_0^{\infty} g(t)\,e^{-j\omega t}\,dt. $$

Then
$C(t) \in L_p$ ($p \ge 1$).


Remark: If $U(t) \in L_{\infty}$ as well, then the system will be termed
$L_1 \cap L_2 \cap L_{\infty}$ stable (bounded).


Proof: Consider Fig. 1. We note that condition (ii) implies (see, for
example, Ref. 17) that:


in |) i “m(b | -m(t)dt + af m(t)é(t)dt 


< [ 2@-moa, vTE[0,~), (5) 


where #(t) = u(t) + qui(t). Now from the defining relations of the 
modulator (see also Fig. 2), the GPWM is an e-positive operator ;}8 L.e., 


ie o(t)-m(i)dt = 0*, yT E[0, ~). 
0 
Thus: 


. / Poe i ” m(te(t)dt 


< bs apa | bs me(at| 6) 


where Schwarz’s inequality has been used on the rhs of (5). Using 


mee 3 Wei 2 RE Hae RTA GO) ford, 4 


Mm x N 
— YretaM dD [lo(KTa+ rx)| — |o(KTa)| ] 
d2 K=0 K=0 


N } 
< Ula@|,[ Xr] 
in which we have used 


; hoe + a eS haere 7 fa 
Py gD) |o(KTa) | 


* Note on Lz this is not true for the periodic PWM. 
† If the truncation time T should occur during the Nth pulse, then, of course,
$\tau_N = T - NT_a$.

Inequality (7) follows from the fact that $\sigma(t)$ is absolutely continuous,
which will be shown below.
Now in view of (2) and (3), $\tau_K \le |\sigma(KT_a + \tau_K)|/A$, and thus:


M( aA if =) os re — M\a(t)|| 2, i re] 
S aM ¥ |o(KT:)|. (8) 


We observe that C(t) [and hence $\sigma(t)$ by A1, Section III] is
absolutely continuous, since it is the indefinite integral of a summable
(Lebesgue) function. Therefore, Lemma 2 is applicable to $\sigma(t)$ and:


x |o(KT.)| <q ff lewla 
K=0 0 


+5 |e(t)|dt + 4L|o(0)| + |o(NT.)|] 
_ (a u(t)| + 31 Jae 
lic Lay! at t 9 
fo (gel + 3161 )at + suplo@|. (9) 
Furthermore, 
[cola s Miglin, & a 
and ey 


lA 


[cola < M(|lg@ll2, + JOT, oki 
Using (9) and (10) in (8) then implies: 


wl Er) B] 


alla [ |u(t) | dé + an ju) |at) 
+ aM sup | (é)| “t 


IIA 


Ma) | Ly" 
4Z 


IIA 


aM( pellwOllat + sla Is.) . 
+ oat suploto|- + MIAO eo 
t20 


* It is a simple matter to show that, under the hypotheses of the theorem, the
system is $L_{\infty}$ stable and, since $U(t) \in L_{\infty}$, $\sup_t |\sigma(t)| < \infty$.




where 


4 = wd ~ Blg lla — SE Colla, + 190] + = 





For Z > 0 [condition (i) of the theorem], we have
$\sum_{K=0}^{N} \tau_K \le Q$ (independent of T) $< \infty$, and thus $m(t) \in L_p$,
which implies by A2, Section III, that C(t) does also, and the theorem is proved.


Comments: (a) Condition (ii) of the theorem is similar to a Popov
condition for feedback systems with static, sector nonlinearities,
although the GPWM does not strictly belong to that class.

(b) The condition allows a tradeoff between the slope of the
stability line and its intersection with the real axis of the Popov
plane.

(c) Because of the constraints of the modulator, the modified linear
plant does not have to be a strictly positive operator, as is
commonly the case.¥

(d) Since the assumptions are sufficient to ensure that U(t) [and
g(t)] $\to 0$ as $t \to \infty$, the theorem also guarantees that $\sigma(t) \to 0$.

Theory of Minimum Mean-Square-Error
QAM Systems Employing Decision
Feedback Equalization

Decision feedback equalization is presently of interest as a technique
for reducing intersymbol interference in high-rate PAM data
communications systems. The basic principle is to cancel out intersymbol
interference arising from previously decided data symbols at the receiver,
leaving remaining intersymbol interference components to be handled by linear
equalization. In this work we consider the application of decision feedback
equalization to quadrature-amplitude modulation (QAM) transmission, in
which two independent information streams modulate quadrature carriers.
Extending Salz's treatment in a companion paper of decision feedback for
a baseband channel, we derive the form of the optimum receiver filters via a
matrix Wiener-Hopf analysis. We obtain explicit analytical expressions
for minimum mean-square error and optimum transmitting filters. The
optimization is subject to a constraint on the transmitted signal power and
assumes no prior decision errors. The class of QAM transmitter and
receiver structures treated here is actually much larger than the class usually
considered for QAM systems. However, our results for decision feedback
equalization show that, for nonexcess bandwidth systems, optimum
performance is achievable without taking advantage of the most general
structure. If the transmitter is required to have the conventional QAM structure,
study of the time continuous system that gives rise to the sampled data
system considered here demonstrates that under quite general assumptions a
nonexcess bandwidth system is optimum. Finally, the explicit description
of the optimum transmitting matrix filter follows from an
information-theoretic "water-pouring" algorithm in conjunction with the determination
of the form of the points of maxima of a determinant extremal problem.


I. INTRODUCTION 


Interest has recently intensified in receiver structures which
hopefully will permit higher data symbol rates than are possible with
conventional demodulator/linear equalizer structures having the same
error probability. The decision feedback equalizer is an example of a 
receiver component that can have important performance advantages 
over a linear equalizer operating over dispersive channels with additive 
noise.1~? The basic structure of a decision feedback equalizer (DFE) 
is shown in Fig. 1. The function of the filter in the feedback path is to 
cancel "postcursors" of the channel's impulse response; that is,
intersymbol interference components arising from previously decided
symbols. Thus, the job of the linear filter in the forward path is to minimize
(according to some criterion) "precursors" of the channel's impulse
response which cause intersymbol interference from future data
symbols. Of course, there is a possibility of error propagation with this
nonlinear feedback structure. We avoid this intractable problem by 
assuming that no erroneous decisions pass into the feedback filter. 
Thus, our results provide a performance lower bound. Earlier experi- 
mental studies indicated that error propagation is not a serious problem 
on some channels.*-4 

Price® (whose bibliography on the subject is extensive) has derived 
asymptotic formulas (allowing for an infinite number of equalizer taps) 
for error probability, optimum transmitter pulse spectrum, and com- 
munication efficiency for the ‘‘zero-forcing’’? DFE, which minimizes the 
noise variance at the DFE output subject to the constraint that the 
intersymbol interference is zero at the receiver’s sampling instants. As 
is the case for linear equalization, the mean-square-error (MSE) cri- 
terion is more general than the zero-forcing criterion. The MSE cri- 
terion minimizes the mean square of the total error (noise plus residual 
intersymbol interference) at the DFE output.?> Asymptotic results 
and illuminating calculations of performance for MSE-minimizing 
DFE’s are contained in a companion paper by Salz.” 

Fig. 1. Basic decision feedback equalizer structure.

Fig. 2. Baseband channel model.

All previous theoretical studies of decision feedback equalization
have assumed a “baseband” linear PAM channel model depicted in
Fig. 2. The transmitted waveform $s(t)$ is

$$ s(t) = \sum_n a_n\, g^{(b)}(t - nT), $$

where the data symbols $\{a_n\}$ are statistically independent
discrete-valued random variables from a finite set and $g^{(b)}(t)$ is some
suitable transmitted pulse waveform. The channel output waveform is then

$$ r^{(b)}(t) = \sum_n a_n\, h^{(b)}(t - nT) + n^{(b)}(t), $$

where the overall impulse response is

$$ h^{(b)}(t) = \int_{-\infty}^{\infty} c^{(b)}(\tau)\, g^{(b)}(t - \tau)\, d\tau $$

and $n^{(b)}(t)$ is additive noise. This model is certainly valid for a real
linear channel accepting every $T$ seconds a pulse of the form
$a_n g^{(b)}(t)$. It is also valid for the important case where the linear
channel $c^{(b)}(t)$ is actually the baseband equivalent of a passband
channel when the modulation is either double-, vestigial-, or
single-sideband. (See Ref. 8, Chapter 7.) Of course, $c^{(b)}(t)$ then
depends on the carrier frequency and on any phase offset between the
reference carriers at the modulator and demodulator.
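As a minimal sketch of the baseband model above, the following Python
fragment forms a discrete-time overall response by convolving a channel
with a transmitted pulse, generates noise-free symbol-rate samples, and
then subtracts the contribution of previous symbols, which is the role of
the feedback filter in Fig. 1. The one-sample-per-symbol discretization,
the function names, and the numerical values are illustrative assumptions
and are not taken from the paper; past decisions are assumed correct.

```python
def convolve(x, y):
    """Full discrete convolution of two finite sequences."""
    out = [0.0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            out[i + j] += xi * yj
    return out


def channel_output(symbols, g, c):
    """Return (h, r): h = c * g and noise-free r[k] = sum_n a_n h[k - n]."""
    h = convolve(c, g)  # overall impulse response, one sample per symbol
    return h, convolve(symbols, h)


def cancel_postcursors(r, h, past_symbols):
    """Subtract sum_{n>=1} h[n] a[k-n] from r[k], assuming correct decisions."""
    y = []
    for k in range(len(past_symbols)):
        isi = sum(h[n] * past_symbols[k - n]
                  for n in range(1, len(h)) if k - n >= 0)
        y.append(r[k] - isi)
    return y


# Hypothetical values: unit pulse, channel with one postcursor of size 0.5.
a = [1.0, -1.0, 1.0, 1.0]
h, r = channel_output(a, g=[1.0], c=[1.0, 0.5])
y = cancel_postcursors(r, h, a)
```

In this toy case the channel has no precursors, so removing the postcursor
term recovers the transmitted symbols exactly; with precursors present, a
forward filter would also be needed.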

In this paper we extend the asymptotic DFE theory to the case of
QAM (quadrature amplitude modulation) signaling, for which the
baseband model of Fig. 2 is not sufficient. We summarize our results
at the end of Section II. The most general QAM transmitter structure
is illustrated in Fig. 3. Two independent data sequences enter a lattice
network comprising filters with impulse responses $g_{11}(t)$, $g_{21}(t)$,
$g_{12}(t)$, and $g_{22}(t)$. Modulation is done with two quadrature carriers
with frequency $f_0$ Hz. In practice, most QAM transmitters are specialized
to the case $g_{11}(t) = g_{22}(t)$; $g_{21}(t) = -g_{12}(t)$.* We call the
class of transmitters with this special structure the class of “passband”
transmitters ($\mathcal{P}$). We show in later sections that optimum
performance is in general achievable by restricting the transmitter to this
class or a simple variant thereof. It is worth noting that QAM systems with
passband transmitters are mathematically equivalent to baseband PAM
systems, but with complex impulse responses and information symbols.*!

* Indeed, it is often assumed that $g_{12}(t)$ and $g_{21}(t)$ are zero.

Fig. 3. General QAM transmitter structure.

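For the most general QAM structure, the waveform $s(t)$ is expressed
in terms of two-dimensional vectors and matrices. Define the vector
$a_n$ to be the $n$th pair of information symbols,

$$ a_n = \begin{pmatrix} a_{1n} \\ a_{2n} \end{pmatrix}. \tag{1} $$

The most general QAM transmitter is characterized by the matrix filter

$$ g(t) = \begin{pmatrix} g_{11}(t) & g_{12}(t) \\ g_{21}(t) & g_{22}(t) \end{pmatrix}. \tag{2} $$

Then the structure of Fig. 3 yields

$$ s(t) = (\cos 2\pi f_0 t,\ \sin 2\pi f_0 t) \sum_n g(t - nT)\, a_n. \tag{3} $$

We assume that the data symbols are uncorrelated discrete-valued
random variables with variance $\sigma_a^2$. Thus

$$ \langle a_n a_m^\dagger \rangle = \sigma_a^2\, \delta_{nm}\, I, \tag{4} $$

where $\langle\ \rangle$ denotes expectation, $\dagger$ denotes transpose,*
$\delta_{nm}$ is the Kronecker delta, and $I$ is the identity matrix. The
transmitted power is then given by

$$ P = \lim_{T_0 \to \infty} \frac{1}{T_0} \int_{-T_0/2}^{T_0/2} \langle s^2(t) \rangle\, dt
     = \frac{\sigma_a^2}{2T} \int_{-\infty}^{\infty} \left[ g_{11}^2(t) + g_{12}^2(t) + g_{21}^2(t) + g_{22}^2(t) \right] dt $$

$$ \phantom{P} = \frac{\sigma_a^2}{4\pi T} \int_{-\infty}^{\infty} \left[ |G_{11}(\omega)|^2 + |G_{12}(\omega)|^2 + |G_{21}(\omega)|^2 + |G_{22}(\omega)|^2 \right] d\omega, \tag{5} $$

where $G_{ij}(\omega)$ is the Fourier transform of $g_{ij}(t)$. For future
reference note that we can also write $P$ as

$$ P = \frac{\sigma_a^2}{4\pi T} \int_{-\pi/T}^{\pi/T} \sum_n \left[ \left| G_{11}\!\left(\omega + \tfrac{2\pi n}{T}\right) \right|^2 + \left| G_{12}\!\left(\omega + \tfrac{2\pi n}{T}\right) \right|^2 + \left| G_{21}\!\left(\omega + \tfrac{2\pi n}{T}\right) \right|^2 + \left| G_{22}\!\left(\omega + \tfrac{2\pi n}{T}\right) \right|^2 \right] d\omega $$

$$ \phantom{P} = \frac{\sigma_a^2}{4\pi T} \int_{-\pi/T}^{\pi/T} \sum_n \mathrm{tr} \left[ G^\dagger\!\left(\omega + \tfrac{2\pi n}{T}\right) G\!\left(\omega + \tfrac{2\pi n}{T}\right) \right] d\omega, \tag{6} $$

where

$$ G(\omega) = \begin{pmatrix} G_{11}(\omega) & G_{12}(\omega) \\ G_{21}(\omega) & G_{22}(\omega) \end{pmatrix} $$

is the matrix frequency response of the transmitter. We use tr to denote
the trace of a matrix.

* The symbol $\dagger$ will denote conjugate transpose for complex vectors
and matrices.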

Later sections will show that without an initial assumption of the 
special passband transmitter structure the treatment of decision feed- 
back equalization for two- (and hence higher) dimensional signals is a 
nontrivial generalization of the baseband signal case. 


II. THE CHANNEL MODEL AND SUMMARY OF RESULTS


The impulse response $q(t)$ of any linear channel can be resolved about
a center frequency $f_0$:

$$ q(t) = c_1(t) \cos 2\pi f_0 t - c_2(t) \sin 2\pi f_0 t. \tag{7} $$

It is easy to show that the channel model of Fig. 4 yields exactly the
above impulse response, and thus any linear channel can be conveniently
represented in terms of an arbitrary center frequency $f_0$ by the
structure of Fig. 4. We note in passing that the so-called “in-phase”
and “quadrature” impulse responses $c_1(t)$ and $c_2(t)$ are often
interpreted as the real and imaginary parts, respectively, of the “complex
envelope” of the impulse response $q(t)$ with respect to the frequency
$f_0$.


We assume that the low-pass transmitter impulse responses $\{g_{ij}(t)\}$
are all strictly bandlimited to lie within the frequency interval
$(-f_0, f_0)$; otherwise, the system would suffer distortion from aliasing
effects. There is then no loss of generality in assuming that the channel’s
in-phase and quadrature impulse responses $c_1(t)$ and $c_2(t)$ are also
strictly bandlimited to this interval.

Fig. 4. A passband channel model.

With these assumptions, double-frequency terms disappear," and it
is easily shown that the noise-free channel output

$$ \int_{-\infty}^{\infty} q(t - \tau)\, s(\tau)\, d\tau, \tag{8} $$

where $q(t)$ is given by (7) and $s(t)$ by (3), can be written

$$ \tfrac{1}{2} (\cos 2\pi f_0 t,\ \sin 2\pi f_0 t) \sum_n \int_{-\infty}^{\infty} c(t - \tau)\, g(\tau - nT)\, d\tau\; a_n, \tag{9} $$

where $c(t)$ is the matrix

$$ c(t) = \begin{pmatrix} c_1(t) & c_2(t) \\ -c_2(t) & c_1(t) \end{pmatrix}, \tag{10} $$

the matrix $g(t)$ is given by (2), and integration of matrices and vectors
means integration of each entry.

Consider receiver structures whose “front end” is the type shown in
Fig. 5: sine and cosine demodulators followed by identical ideal low-pass
filters that are strictly bandlimited to $(-f_0, f_0)$ and whose outputs
are labelled $r(t)$ and $\hat{r}(t)$, respectively. This structure causes
no loss of information, since any bandlimited input signal can be
reproduced exactly if the outputs $r(t)$ and $\hat{r}(t)$ are multiplied by
$\cos 2\pi f_0 t$ and $\sin 2\pi f_0 t$, respectively, and then added
together. The function of the low-pass filters is to remove
double-frequency terms; it will turn out that the “front end” will be
followed by a band-limiting matched filter, so that the low-pass filters
are not necessary.

Fig. 5. Receiver “front end.”

The low-pass outputs $r(t)$ and $\hat{r}(t)$ can be written in vector form as

$$ \mathbf{r}(t) = \begin{pmatrix} r(t) \\ \hat{r}(t) \end{pmatrix} = \sum_n h(t - nT)\, a_n + \nu(t), \tag{11} $$

where the matrix impulse response $h(t)$ is

$$ h(t) = \int_{-\infty}^{\infty} c(t - \tau)\, g(\tau)\, d\tau \tag{12} $$

and the components of the vector

$$ \nu(t) = \begin{pmatrix} n_c(t) \\ n_s(t) \end{pmatrix} $$

represent additive noise. Assuming that the additive noise in the
channel is white with double-sided power spectral density $N_0/2$, it can
be shown" that $n_c(t)$ and $n_s(t)$ are statistically independent
stationary zero-mean processes; each is the result of passing a stationary
white noise with double-sided power spectral density $N_0$ through an ideal
low-pass filter. Noise outside the signal bandwidth will be eliminated
by a matched filter. Accordingly, we take the covariance matrix of the
noise to be

$$ \langle \nu(t)\, \nu^\dagger(t + \tau) \rangle = N_0\, I\, \delta(\tau), \tag{13} $$

where $I$ is the identity matrix and $\delta(\cdot)$ is a “unit-area delta
function.” The mathematical model for the transmitter and channel is now
complete and is summarized in Fig. 6a.

We remark in passing that linear modulation of a single stream of
data symbols (e.g., single-sideband or vestigial-sideband modulation)
constitutes a special case of this model. In that case,
$g_{12}(t) = g_{22}(t) = 0$, and the receiver front end consists of a
cosine demodulator with some phase shift $\theta$, followed by an ideal
low-pass filter. Then the overall impulse response is a scalar function of
time (which depends on the receiver phase shift $\theta$), and hence all
two-dimensional matrices and vectors in the present treatment would be
replaced by scalar quantities (see Ref. 7).

Fig. 6. (a) Canonical mathematical model of transmitter and channel. (b) QAM
decision feedback equalizer structure. (c) Structure of the matrix filter $w(t)$.


The following list summarizes our main results:

(i) The optimum linear forward filter at the receiver for a given
transmitter-channel cascade, $H(\omega) = C(\omega)G(\omega)$, is found to
have the form

$$ \mathrm{Const} \times \left[ \sqrt[-]{\Phi(e^{-j\omega T}) + \frac{N_0}{\sigma_a^2} I}\ \right]^{-1} H^\dagger(\omega), $$

where

$$ \Phi(e^{-j\omega T}) = \frac{1}{T} \sum_n H^\dagger\!\left(\omega + \frac{2\pi n}{T}\right) H\!\left(\omega + \frac{2\pi n}{T}\right) $$

and $\sqrt[-]{\ }$ denotes minimum-phase square root. This filter can be
viewed as a matrix matched filter followed by an anticausal matrix tap
delay line. (See Sections III and VI.)

(ii) For a given transmitter power spectral density: if a nonexcess
bandwidth system (Section V) is required, an optimum transmitter is
found and it is passband; conversely, if the transmitter is taken to be
passband, the optimum system is a nonexcess bandwidth one (Section VIII).

(iii) Given a passband transmitter, the MSE (the sum of the mean-square
errors of the two unquantized receiver outputs) is

$$ 2\sigma_a^2 \exp\left\{ -\frac{T}{2\pi} \int_{-\pi/T}^{\pi/T} \log\left[ \frac{\sigma_a^2}{N_0} X_{eq}(\omega) + 1 \right] d\omega \right\}, $$

where

$$ X_{eq}(\omega) = \frac{1}{T} \sum_n \left| G_1\!\left(\omega + \frac{2\pi n}{T}\right) + jG_2\!\left(\omega + \frac{2\pi n}{T}\right) \right|^2 \left| C_1\!\left(\omega + \frac{2\pi n}{T}\right) + jC_2\!\left(\omega + \frac{2\pi n}{T}\right) \right|^2 $$

and $G_1(\omega) = G_{11}(\omega) = G_{22}(\omega)$ and
$G_2(\omega) = G_{12}(\omega) = -G_{21}(\omega)$ (Section VI).

(iv) The optimum transmitter power spectral density is found for the
class of passband transmitters meeting an output power constraint
(Section VII). This optimal density has a water-pouring description.
(Since the processing capability considered here represents an advancement
over conventional linear equalization, this emergence of an
information-theoretic type density is perhaps not surprising.)

(v) Although we do not constrain the in-phase and quadrature mean-square
errors to be equal, we show that for the above-mentioned optimal systems
the errors on the two data streams have equal variances and are
uncorrelated (Section VI).

In a nutshell, the system optimization proceeds as follows:


(i) Find the optimal receiver for each transmitter.
(ii) Find an optimal transmitter for each transmitter power spectral
density.
(iii) Find the optimal transmitter power spectral density.


Then we reverse, using the solution of (iii) to specify an optimal
transmitter and then using this optimal transmitter to specify the optimal
receiver.


III. THE RECEIVER OPTIMIZATION PROBLEM


The DFE structure consists of a linear matrix filter $w(t)$, quantizer,
and a transversal feedback filter with matrix tap coefficients $\{b_n\}$
which processes previously made decisions as shown in Fig. 6b. The $k$th
sampled vector input to the quantizers is written

$$ y_k = \int_{-\infty}^{\infty} w(\tau)\, \mathbf{r}(kT - \tau)\, d\tau - \sum_{n=1}^{\infty} b_n\, \hat{a}_{k-n}, \tag{14} $$

where $\mathbf{r}(t)$ is the received vector of (11) and $\hat{a}_n$ is the
receiver’s decision on the $n$th data symbol-pair. Note that we allow the
feedforward and feedback matrix filters to have infinite-duration impulse
responses. We also replace $\hat{a}_{k-n}$ in (14) by the true data symbol
vector $a_{k-n}$ for mathematical tractability; thus, we in effect
postulate a “magic genie” preceding the feedback filter who corrects any
decision errors. The genie’s existence is immaterial up to the time of the
first decision error, and hence our expression for MSE is certainly valid
up to that time.

The error vector $\epsilon_k$ is defined to be the difference between
$y_k$ and the correct symbol $a_k$,

$$ \epsilon_k = y_k - a_k, \tag{15} $$

and the MSE is defined to be the trace of the error matrix $\epsilon_0$,
where

$$ \epsilon_0 = \langle \epsilon_k \epsilon_k^\dagger \rangle, \tag{16} $$

the average being with respect to the noise and the data symbol sequence.
Note that $\epsilon_0$ is positive semidefinite and symmetric.
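Substituting (14) and (15) into (16) and using the noise correlation
matrix (13) and the data symbol correlation matrix (4), we can write

$$ \epsilon_0 = \sigma_a^2 \sum_{n=-\infty}^{\infty} \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} w(\tau_1)\, h[(k-n)T - \tau_1]\, h^\dagger[(k-n)T - \tau_2]\, w^\dagger(\tau_2)\, d\tau_1\, d\tau_2 $$

$$ \quad + N_0 \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} w(\tau_1)\, w^\dagger(\tau_2)\, \delta(\tau_1 - \tau_2)\, d\tau_1\, d\tau_2 + \sigma_a^2 I $$

$$ \quad + \sigma_a^2 \sum_{n=1}^{\infty} \left\{ b_n b_n^\dagger - b_n \left[ \int_{-\infty}^{\infty} w(\tau)\, h(nT - \tau)\, d\tau \right]^\dagger - \left[ \int_{-\infty}^{\infty} w(\tau)\, h(nT - \tau)\, d\tau \right] b_n^\dagger \right\} $$

$$ \quad - \sigma_a^2 \int_{-\infty}^{\infty} w(\tau)\, h(-\tau)\, d\tau - \sigma_a^2 \int_{-\infty}^{\infty} h^\dagger(-\tau)\, w^\dagger(\tau)\, d\tau. \tag{17} $$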
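We immediately observe that tr $\epsilon_0$ is minimized with respect to
the matrices $\{b_n\}$ if and only if for all $n \ge 1$, $b_n = s_n$, where

$$ s_n = \int_{-\infty}^{\infty} w(\tau)\, h(nT - \tau)\, d\tau \quad \text{all } n \tag{18} $$

represents the matrix samples for $n \ge 1$ of the impulse response of the
transmitter/channel-receiver filter combination. Then, once the $\{b_n\}$
are optimized in this way, the remaining terms comprising the matrix
$\epsilon_0$ can be written

$$ \epsilon_0 = \sigma_a^2 \sum_{n \le 0} \left[ \delta_{n0} I - \int_{-\infty}^{\infty} w(\tau)\, h(nT - \tau)\, d\tau \right] \left[ \delta_{n0} I - \int_{-\infty}^{\infty} w(\tau)\, h(nT - \tau)\, d\tau \right]^\dagger + N_0 \int_{-\infty}^{\infty} w(\tau)\, w^\dagger(\tau)\, d\tau. \tag{19} $$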
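The effect of choosing $b_n = s_n$ can be illustrated with a scalar
analogue of the error expression. The sketch below is only an
illustration under stated assumptions: the samples `s`, the symbol
variance, and the lumped noise term `noise_var` (standing in for the
$N_0 \int w w^\dagger$ contribution) are hypothetical numbers, and the
function name `dfe_mse` is not from the paper.

```python
def dfe_mse(s, b, sigma2, noise_var):
    """Scalar MSE for overall samples s[n] and feedback taps b[n], n >= 1.

    Terms with n >= 1 are postcursors, reduced by the feedback taps;
    terms with n <= 0 are the main-tap and precursor errors, which the
    feedback filter cannot touch.
    """
    total = noise_var
    for n, s_n in s.items():
        if n >= 1:
            total += sigma2 * (s_n - b.get(n, 0.0)) ** 2
        else:
            target = 1.0 if n == 0 else 0.0
            total += sigma2 * (target - s_n) ** 2
    return total


# Hypothetical sampled response: one precursor, main tap, two postcursors.
s = {-1: 0.1, 0: 0.9, 1: 0.4, 2: -0.2}
mse_no_feedback = dfe_mse(s, {}, sigma2=1.0, noise_var=0.05)
mse_optimal = dfe_mse(s, {1: 0.4, 2: -0.2}, sigma2=1.0, noise_var=0.05)
```

With the feedback taps set equal to the postcursor samples, only the
$n \le 0$ terms and the noise term remain, mirroring the structure of (19).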


We wish to minimize tr $\epsilon_0$ with respect to the entries in the
matrix $w(t)$. Notice from eq. (16) that tr $\epsilon_0$ is a positive
quadratic form. Thus from Ref. 12 we set the gradient equal to zero to
determine the stationary points, which are necessarily points of global
minima. We shall find that there is only one solution.

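Proceeding with the calculus of variations method, we replace

$$ w(t) = \begin{pmatrix} w_{11}(t) & w_{12}(t) \\ w_{21}(t) & w_{22}(t) \end{pmatrix}
   \quad \text{by} \quad
   w(t) + \begin{pmatrix} \varepsilon_{11}\eta_{11}(t) & \varepsilon_{12}\eta_{12}(t) \\ \varepsilon_{21}\eta_{21}(t) & \varepsilon_{22}\eta_{22}(t) \end{pmatrix}, $$

where the $\eta_{ij}(t)$ are arbitrary. Setting

$$ \frac{\partial\, \mathrm{tr}\, \epsilon_0}{\partial \varepsilon_{11}} = \frac{\partial\, \mathrm{tr}\, \epsilon_0}{\partial \varepsilon_{12}} = \frac{\partial\, \mathrm{tr}\, \epsilon_0}{\partial \varepsilon_{21}} = \frac{\partial\, \mathrm{tr}\, \epsilon_0}{\partial \varepsilon_{22}} = 0
   \quad \text{at } \varepsilon_{11} = \varepsilon_{12} = \varepsilon_{21} = \varepsilon_{22} = 0, $$

we get

$$ -\sigma_a^2 h^\dagger(-\tau) + \sigma_a^2 \sum_{n \le 0} \int_{-\infty}^{\infty} w(s)\, h(nT - s)\, ds\; h^\dagger(nT - \tau) + N_0\, w(\tau) = [0] $$

or

$$ w(\tau) = \sum_{n \le 0} w_n\, h^\dagger(nT - \tau), \tag{20} $$

where

$$ w_n = \frac{\sigma_a^2}{N_0} (\delta_{n0} I - s_n), \quad n \le 0. \tag{21} $$

This means that the matrix filter $w(t)$ can be interpreted as a matrix
matched filter with impulse response $h^\dagger(-t)$ followed by a sampler
and matrix transversal filter with matrix tap coefficients $\{w_n\}$. Note
that the transversal filter is “anticausal”; that is, $w_n = [0]$ for
$n > 0$. The structure of $w(t)$ is illustrated in Fig. 6c.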

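Furthermore, substitution of the optimum filter (20) back into
expression (19) for the error matrix $\epsilon_0$ results in

$$ \epsilon_0 = \sigma_a^2 (I - s_0)^\dagger \tag{22a} $$

and from (21)

$$ \epsilon_0 = N_0\, w_0^\dagger. \tag{22b} $$

An explicit solution for the optimum tap coefficient matrices $\{w_n\}$
can be obtained by postmultiplying (20) by $h(nT - \tau)$ (with the
summation index in (20) renamed $m$) and integrating, using (21) and (18)
and the definition

$$ \phi_n = \int_{-\infty}^{\infty} h^\dagger(-\tau)\, h(nT - \tau)\, d\tau $$

to yield

$$ \sum_{m \le 0} w_m \left[ \phi_{n-m} + \frac{N_0}{\sigma_a^2}\, \delta_{nm} I \right] = \delta_{n0} I \quad \text{for } n \le 0. \tag{23} $$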

We recognize eq. (23) as a classical Wiener-Hopf equation for which 
we are assured the existence of a unique solution.” 

We attach a plus (minus) subscript to any matrix sequence whose
value is the zero matrix on the strictly negative (positive) integers.*
By $\mathbf{1}$ we mean the matrix sequence vanishing everywhere but zero,
where the value is $I$. Then (23) is written

$$ w_- * \rho = \mathbf{1} \quad (n \le 0), \tag{24} $$

where we have used $\rho$ to denote the sequence sum
$\phi_n + (N_0/\sigma_a^2)\, \delta_{n0} I$, which we observe is the Fourier
coefficient sequence for a positive definite Hermitian matrix function
whose determinant is uniformly above $(N_0/\sigma_a^2)^2$. Hence $\rho$
admits a causal, anticausal deconvolution of the kind provided by Wiener
and Akutowicz" (generalizing a result of Szegő). Based on Ref. 14, we
can say

$$ \rho = u_- * u_+, \qquad \{ (u_-)_n = (u_+)_{-n}^\dagger \}, \quad n \le 0, \tag{25} $$

where we have used $(u_-)_n$ to denote the $n$th entry in the $u_-$
sequence (similarly for $u_+$). Corresponding to $u_-$, we have its
convolution inverse, $[u_-]^{-1}$, which is also anticausal.

* A matrix sequence $w_+$, zero on the strictly negative integers, is
referred to as causal. A sequence $w_-$, zero on the strictly positive
integers, is referred to as anticausal.

From what we have just said,

$$ [(u_+)_0]\, w_- * u_- = \mathbf{1}, \tag{26} $$

and so $w_- = [(u_+)_0]^{-1} (u_-)^{-1}$ and, in particular,

$$ w_0 = (u_0 u_0^\dagger)^{-1}, \tag{27} $$

where in the last equation the negative subscripts have been suppressed.
Thus the anticausal transversal filter is found from eqs. (23) through
(26) to have a frequency response inversely proportional to

$$ \sqrt[-]{\Phi(e^{-j\omega T}) + \frac{N_0}{\sigma_a^2} I}, $$

where the notation $\sqrt[-]{\ }$ means minimum-phase square root and
$\Phi(e^{-j\omega T})$ is the discrete Fourier transform of the matrix
sequence $\{\phi_n\}$. Recalling eq. (22b), we have the following
expression for the error matrix:

$$ \epsilon_0 = N_0\, w_0 = N_0 [u_0 u_0^\dagger]^{-1}. \tag{28} $$

We remark that the development so far is analogous to that of the
baseband decision feedback equalizer.® Further progress toward achieving
a closed-form expression for tr $\epsilon_0$ thus depends on obtaining a
closed-form expression for the matrix $[u_0 u_0^\dagger]^{-1}$ or for its
trace, corresponding to the result recently developed for the baseband
case.’ It has not been possible to do this directly for the QAM case when
the most general transmitter matrix is allowed. We shall prove under quite
general conditions that the minimum of tr $w_0$ is achieved with a
transmitter of passband structure, and that, given this transmitter
structure,

$$ \mathrm{tr}\, w_0 = 2\sqrt{\det w_0} $$

and

$$ \det w_0 = \exp\left\{ -\frac{T}{2\pi} \int_{-\pi/T}^{\pi/T} \log \det \left[ \Phi(e^{-j\omega T}) + \frac{N_0}{\sigma_a^2} I \right] d\omega \right\}, $$

where “det” denotes the determinant.
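The exponential of an averaged log spectrum in the expression above has a
familiar scalar counterpart: for a scalar spectrum with minimum-phase
factor $u_0 + u_1 e^{j\omega}$ ($u_0 > |u_1|$), the quantity
$\exp\{(1/2\pi)\int \log |\cdot|^2\, d\omega\}$ equals $u_0^2$. The
following sketch checks this numerically for $T = 1$. The two-tap factor,
its coefficient values, and the helper name `geometric_mean` are
assumptions chosen for illustration.

```python
import cmath
import math


def geometric_mean(spectrum, num_points=4096):
    """exp{(1/2pi) * integral over (-pi, pi) of log spectrum(w) dw}, midpoint rule."""
    total = 0.0
    for k in range(num_points):
        w = -math.pi + (k + 0.5) * 2.0 * math.pi / num_points
        total += math.log(spectrum(w))
    return math.exp(total / num_points)


# Hypothetical two-tap factor with its zero outside the unit circle region
# of interest (u0 > |u1|), so that u0 is the leading coefficient.
u0, u1 = 2.0, 0.5


def spectrum(w):
    return abs(u0 + u1 * cmath.exp(1j * w)) ** 2


gm = geometric_mean(spectrum)   # numerically close to u0 ** 2
w0_scalar = 1.0 / gm            # scalar analogue of det w0
```

In the scalar case `w0_scalar` plays the role of $\det w_0$, which is why
the exponent in the matrix formula carries a minus sign.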


IV. CLOSED-FORM EXPRESSION FOR DET $\epsilon_0$


For the most general matrix filter, attainment of a closed-form MSE
expression for tr $\epsilon_0$ in terms of the matrix $\Phi(e^{-j\omega T})$
has so far proved intractable. However, we shall see that such a general
expression is unnecessary to describe the behavior of optimum systems. Our
approach is to employ the following easily proven lower bound for
$2 \times 2$ positive semidefinite symmetric matrices

$$ \mathrm{tr}\, w_0 \ge 2\sqrt{\det w_0}, \tag{29a} $$

which holds with equality if and only if $w_0$ is a scalar matrix (i.e.,
multiple of the identity). In this section we develop a closed-form
expression for det $w_0$. In the following sections where we deal with
optimum systems, we can always perform the analysis in a context
where eq. (29a) holds with equality.
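The bound (29a) is the arithmetic-geometric mean inequality applied to the
two nonnegative eigenvalues, whose sum is the trace and whose product is
the determinant. A small numerical sketch follows; the function name and
the example matrices are illustrative assumptions.

```python
import math


def trace_and_bound(m):
    """For a real symmetric PSD matrix m = [[p, q], [q, s]],
    return (tr m, 2 * sqrt(det m))."""
    (p, q), (_, s) = m
    det = p * s - q * q
    return p + s, 2.0 * math.sqrt(max(det, 0.0))


# A scalar matrix meets the bound with equality ...
tr_scalar, bound_scalar = trace_and_bound([[2.0, 0.0], [0.0, 2.0]])
# ... while a non-scalar positive definite matrix exceeds it.
tr_other, bound_other = trace_and_bound([[3.0, 1.0], [1.0, 1.0]])
```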

We begin the analysis of $\sqrt{\det w_0}$ by recalling (25), from which
follows

$$ \det \left[ \Phi(e^{-j\omega T}) + \frac{N_0}{\sigma_a^2} I \right] = \det U_-(e^{-j\omega T})\, \det [U_-(e^{-j\omega T})]^\dagger. $$

Then from the one-dimensional theory we have’®

$$ \det (u_0 u_0^\dagger) = \exp\left\{ \frac{T}{2\pi} \int_{-\pi/T}^{\pi/T} \log \det \left[ \Phi(e^{-j\omega T}) + \frac{N_0}{\sigma_a^2} I \right] d\omega \right\}; $$

and from (28) and (29a)

$$ \frac{1}{N_0}\, \mathrm{tr}\, \epsilon_0 = \mathrm{tr}\, w_0 \ge 2\sqrt{\det w_0}, \tag{29b} $$

where

$$ \det w_0 = \exp\left\{ -\frac{T}{2\pi} \int_{-\pi/T}^{\pi/T} \log \det \left[ \Phi(e^{-j\omega T}) + \frac{N_0}{\sigma_a^2} I \right] d\omega \right\}. \tag{29c} $$


V. FOR A NONEXCESS BANDWIDTH SYSTEM, PASSBAND TRANSMITTERS
CANNOT BE OUTPERFORMED


In this section we begin by expressing $\Phi$ explicitly in terms of the
transmitter and channel matrices. Then we define the notion of a
nonexcess bandwidth system. The primary result of this section is that,
for a nonexcess bandwidth system, if the transmitter power density
function

$$ f(\omega) = \mathrm{tr}\, G^\dagger(\omega) G(\omega) \quad \left( f(\omega) \ge 0,\ \frac{T}{2\pi} \int_{-\pi/T}^{\pi/T} f(\omega)\, d\omega = P \right) $$

is specified, then there exists a passband transmitter in the class of all
matrix transmitters optimal under the constraint that $f(\omega)$ is the
power density function.

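To display the dependence of our results so far on the transmitter
frequency response $G(\omega)$, we first rewrite the matrix
$\Phi(e^{-j\omega T})$ using the definition of $\phi_n$ as

$$ \Phi(e^{-j\omega T}) = \int_{-\infty}^{\infty} h^\dagger(-\tau)\, \mathcal{H}(\omega, \tau)\, d\tau, \tag{30a} $$

where

$$ \mathcal{H}(\omega, \tau) = \sum_n h(nT - \tau)\, e^{-j\omega nT}. \tag{30b} $$

Expression (30b) is a Fourier series. Thus,

$$ h(nT - \tau) = \frac{T}{2\pi} \int_{-\pi/T}^{\pi/T} \mathcal{H}(\omega, \tau)\, e^{j\omega nT}\, d\omega. \tag{31} $$

But the matrix impulse response $h(nT - \tau)$ can also be written as
the inverse Fourier transform of a matrix frequency response $H(\omega)$,

$$ h(nT - \tau) = \frac{1}{2\pi} \int_{-\infty}^{\infty} H(\omega)\, e^{j\omega(nT - \tau)}\, d\omega, $$

which, upon splitting up the range of integration and changing the
variable of integration, can be written

$$ h(nT - \tau) = \frac{1}{2\pi} \int_{-\pi/T}^{\pi/T} e^{j\omega nT} \left[ \sum_m H\!\left(\omega + \frac{2\pi m}{T}\right) \exp\left\{ -j\left(\omega + \frac{2\pi m}{T}\right)\tau \right\} \right] d\omega. \tag{32} $$

Equating the integrands in (31) and (32), we obtain an explicit expression
for $\mathcal{H}(\omega, \tau)$ which when substituted into (30a) yields

$$ \Phi(e^{-j\omega T}) = \frac{1}{T} \sum_n H^\dagger\!\left(\omega + \frac{2\pi n}{T}\right) H\!\left(\omega + \frac{2\pi n}{T}\right). \tag{33} $$

Furthermore, denoting the Fourier transforms of $c(t)$ and $g(t)$ by the
channel matrix $C(\omega)$ and the transmitter matrix $G(\omega)$
respectively, we can write $H(\omega) = C(\omega)G(\omega)$ and

$$ \Phi(e^{-j\omega T}) = \frac{1}{T} \sum_n G^\dagger\!\left(\omega + \frac{2\pi n}{T}\right) C^\dagger\!\left(\omega + \frac{2\pi n}{T}\right) C\!\left(\omega + \frac{2\pi n}{T}\right) G\!\left(\omega + \frac{2\pi n}{T}\right). \tag{34} $$

A nonexcess bandwidth system is defined by the property that for any
radian frequency $\omega$ there is no more than one nonzero term in the
above sum. It can be taken to be the $n = 0$ term by making a trivial
frequency translation where necessary. Hence for a nonexcess bandwidth
system

$$ \Phi(e^{-j\omega T}) = \frac{1}{T}\, G(\omega)^\dagger C(\omega)^\dagger C(\omega) G(\omega) \quad \left( |\omega| \le \frac{\pi}{T} \right). \tag{35} $$

In this section we deal exclusively with nonexcess bandwidth systems.
In Section VIII we refer to a recent theorem of H. Witsenhausen which
enables us to do a complete analysis of excess bandwidth systems by
transforming them to a canonical nonexcess bandwidth “equivalent”
and then transforming back.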
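The folding in (33) and the nonexcess bandwidth property can be
illustrated with a scalar sketch. The brick-wall responses, the
truncation of the sum to $|n| \le 3$, and the function names below are
assumptions made purely for illustration.

```python
import math


def folded_terms(H, w, T, n_max=3):
    """Terms |H(w + 2*pi*n/T)|^2 for |n| <= n_max (scalar analogue of (33))."""
    return [abs(H(w + 2.0 * math.pi * n / T)) ** 2
            for n in range(-n_max, n_max + 1)]


def folded_spectrum(H, w, T, n_max=3):
    """(1/T) * sum of the folded terms."""
    return sum(folded_terms(H, w, T, n_max)) / T


def is_nonexcess_at(H, w, T, n_max=3):
    """True if at most one folded term is nonzero at frequency w."""
    return sum(1 for t in folded_terms(H, w, T, n_max) if t > 0.0) <= 1


def brick(width):
    """Ideal low-pass response of unit gain on |w| <= width."""
    return lambda w: 1.0 if abs(w) <= width else 0.0


T = 1.0
narrow = brick(math.pi / T)        # confined to the Nyquist band
wide = brick(1.5 * math.pi / T)    # 50 percent excess bandwidth
```

For `narrow`, a single translate contributes at every frequency in the
Nyquist band; for `wide`, two translates overlap near the band edge, which
is exactly the situation excluded by the nonexcess bandwidth definition.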

To model the class of transmitter frequency responses $G(\omega)$, we
introduce $\mathcal{G}$ to denote the (Hilbert) space of all $2 \times 2$
matrices whose entries are Hermitian symmetric
$\{G(\omega) = [G(-\omega)]^*\}$ finite-energy functions on
$(-\pi/T, \pi/T)$. The Hermitian symmetry of the entries is required
so that each entry represents the Fourier transform of a real time
function. As in Section I, we use $\mathcal{P}$ to denote the passband
subspace of $\mathcal{G}$ consisting of matrices of the form

$$ \begin{pmatrix} G_{11} & G_{12} \\ -G_{12} & G_{11} \end{pmatrix}. $$

We shall be dealing only with matrix filters $G$ of finite power $P$, given
by (6). Thus we use $\mathcal{G}_P$ and $\mathcal{P}_P$ to denote

$$ \left\{ G \;\Big|\; \frac{1}{2\pi} \int_{-\pi/T}^{\pi/T} \mathrm{tr} \left[ G^\dagger(\omega) G(\omega) \right] d\omega = \frac{2TP}{\sigma_a^2} \right\} $$

in $\mathcal{G}$ and $\mathcal{P}$, respectively. In the sequel, all
transmitter filters will be assumed to have power $P$.

We now optimize $\det (G^\dagger C^\dagger C G + (N_0/\sigma_a^2) I)$ at
each radian frequency $\omega$ for a fixed amount of power transmitted at
$\omega$.

Fix $N_0 > 0$, $f > 0$ and

$$ C = \begin{pmatrix} C_1 & C_2 \\ -C_2 & C_1 \end{pmatrix} $$

($C_i$ complex functions of frequency). Explicitly, we shall show that

$$ \max \det \left[ G^\dagger C^\dagger C G + N_0 I \right] $$

over all complex $G$ such that tr $G^\dagger G = f(\omega)$ is achieved for
a $G$ of the passband form

$$ \begin{pmatrix} G_{11} & G_{12} \\ -G_{12} & G_{11} \end{pmatrix}. $$

(For linear QAM systems, the same determinant extremal problem arises in
the optimum selection of a transmitter with a specified power spectral
density function. To our knowledge, this aspect of linear QAM systems has
escaped the literature.)


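Notice the unitary transformation

$$ \Psi = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & -j \\ -j & 1 \end{pmatrix} $$

diagonalizes matrices of the form
$\begin{pmatrix} a & jb \\ -jb & a \end{pmatrix}$ in that

$$ \Psi^\dagger \begin{pmatrix} a & jb \\ -jb & a \end{pmatrix} \Psi = \begin{pmatrix} a + b & 0 \\ 0 & a - b \end{pmatrix}. \tag{36} $$

Since $C^\dagger C$ is of the form
$\begin{pmatrix} a & jb \\ -jb & a \end{pmatrix}$ ($a$, $b$ real,
$a \ge |b|$), if we let $G = \Psi B$ the problem becomes

$$ \max \det \left\{ B^\dagger \begin{pmatrix} a + b & 0 \\ 0 & a - b \end{pmatrix} B + N_0 I \right\}, \quad \mathrm{tr}\, B^\dagger B = f(\omega). $$

Let $D = \begin{pmatrix} a + b & 0 \\ 0 & a - b \end{pmatrix}$ and rewrite
the problem as

$$ \max \left\{ \det (B^\dagger D B) + N_0\, \mathrm{tr}(B^\dagger D B) + N_0^2 \right\}, \quad \mathrm{tr}\, B^\dagger B = f(\omega). $$

At this stage we denote the Hermitian matrix $BB^\dagger$ by $Q$ and write

$$ \max \left\{ \det QD + N_0\, \mathrm{tr}\, QD + N_0^2 \right\}, \quad \mathrm{tr}\, Q = f(\omega). $$

Of course, an optimum $Q$ exists, since we are maximizing a continuous
function over a compact set. A nonzero off-diagonal entry in $Q$ would
only affect the determinant (it lowers $\det Q = q_{11}q_{22} - |q_{12}|^2$)
and not the traces. Since $Q$ is Hermitian, the optimal $Q$ is diagonal.
Retracking, $Q = BB^\dagger = \Psi^\dagger G (\Psi \Psi^\dagger) G^\dagger \Psi$.
Now $Q$ is positive definite and so has a positive definite square root
$Q^{1/2}$. From the definition of $B$,

$$ G = \Psi Q^{1/2} \Psi^\dagger, $$

which shows that the optimal $G$ has the form

$$ G = \begin{pmatrix} G_{11} & G_{12} \\ -G_{12} & G_{11} \end{pmatrix}. \tag{37} $$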

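Although the proof is now complete, we go further and find $G_{11}$ and
$G_{12}$, as this will be used in the sequel. To find $G_{11}$ and $G_{12}$
we have seen that we must first find the entries $q_{11}$ and $q_{22}$ of
$Q$ so as to maximize

$$ \left\{ q_{11} q_{22} (a + b)(a - b) + N_0 \left[ q_{11}(a + b) + q_{22}(a - b) \right] + N_0^2 \right\} $$

on the triangle in the $(q_{11}, q_{22})$ plane described by

$$ q_{11} + q_{22} \le f, \quad q_{11} \ge 0, \quad q_{22} \ge 0. $$

Since $a > 0$ and $a \ge |b|$, the optimum $(q_{11}, q_{22})$ is achieved
with $q_{11} + q_{22} = f$. Let $\lambda$ linearly parametrize the segment
joining $(f, 0)$ and $(0, f)$ as shown below

[Figure: the segment $q_{11} + q_{22} = f$ in the $(q_{11}, q_{22})$ plane,
running from $\lambda = 0$ at $(f, 0)$ to $\lambda = 1$ at $(0, f)$; the
solution lies on this segment.]

so $(q_{11}, q_{22}) = [(1 - \lambda) f,\ \lambda f]$ where
$(0 \le \lambda \le 1)$. The criterion becomes: Maximize

$$ f \left\{ \lambda (1 - \lambda) f (a^2 - b^2) + N_0 \left[ (1 - \lambda)(a + b) + \lambda (a - b) \right] \right\} + N_0^2, \tag{38} $$

which is a parabola concave in $\lambda$. Our problem is to determine
$\lambda_{opt}$ ($0 \le \lambda_{opt} \le 1$). Now the parabola is
maximized at

$$ \hat{\lambda} = \frac{1}{2} - \frac{N_0}{f}\, \frac{b}{a^2 - b^2}. $$

If $\hat{\lambda}$ satisfies $0 < \hat{\lambda} < 1$, then
$\lambda_{opt} = \hat{\lambda}$. If $\hat{\lambda} < 0$,
$\lambda_{opt} = 0$ and if $\hat{\lambda} > 1$, $\lambda_{opt} = 1$.
So from $G = \Psi Q^{1/2} \Psi^\dagger$, we obtain

$$ G_{11} = \frac{\sqrt{f}}{2} \left| \sqrt{1 - \lambda_{opt}} + \sqrt{\lambda_{opt}} \right| \tag{39a} $$

$$ G_{12} = j\, \frac{\sqrt{f}}{2} \left| \sqrt{1 - \lambda_{opt}} - \sqrt{\lambda_{opt}} \right| \mathrm{signum}\, b. \tag{39b} $$

The determination of the signs attached to $G_{11}$ and $G_{12}$ was made by
noticing that at each frequency

$$ \det \left( G^\dagger C^\dagger C G + N_0 I \right) \tag{40} $$

is invariant to the sign of $G_{11}$, while (40) is maximized if signum $b$
is used for $G_{12}$.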
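The per-frequency power split can be sketched as follows. The code
assumes the parabola maximizer
$\hat{\lambda} = \tfrac{1}{2} - N_0 b / [f(a^2 - b^2)]$ obtained by
differentiating the criterion, clamps it to $[0, 1]$, and builds the two
passband entries with the sign of $b$ attached to $G_{12}$. It requires
$a > |b|$; the function names and test values are hypothetical.

```python
import math


def optimal_split(a, b, f, n0):
    """Clamped maximizer of the concave parabola in lambda (requires a > |b|)."""
    lam = 0.5 - n0 * b / (f * (a * a - b * b))
    return min(1.0, max(0.0, lam))


def criterion(lam, a, b, f, n0):
    """q11*q22*(a^2 - b^2) + n0*[q11*(a+b) + q22*(a-b)] + n0^2."""
    q11, q22 = (1.0 - lam) * f, lam * f
    return q11 * q22 * (a * a - b * b) + n0 * (q11 * (a + b) + q22 * (a - b)) + n0 * n0


def passband_entries(a, b, f, n0):
    """Entries (G11, G12) of the passband matrix [[G11, G12], [-G12, G11]]."""
    lam = optimal_split(a, b, f, n0)
    root = math.sqrt(f) / 2.0
    g11 = root * (math.sqrt(1.0 - lam) + math.sqrt(lam))
    g12 = 1j * root * abs(math.sqrt(1.0 - lam) - math.sqrt(lam)) * math.copysign(1.0, b)
    return g11, g12


def det_with_passband_g(a, b, f, n0):
    """det(G^dagger C^dagger C G + n0 I) with C^dagger C = [[a, jb], [-jb, a]]."""
    g11, g12 = passband_entries(a, b, f, n0)
    G = [[g11, g12], [-g12, g11]]
    CC = [[a, 1j * b], [-1j * b, a]]
    Gd = [[G[j][i].conjugate() for j in range(2)] for i in range(2)]
    mul = lambda x, y: [[sum(x[i][k] * y[k][j] for k in range(2))
                         for j in range(2)] for i in range(2)]
    M = mul(Gd, mul(CC, G))
    return ((M[0][0] + n0) * (M[1][1] + n0) - M[0][1] * M[1][0]).real


lam_pos = optimal_split(2.0, 1.0, 1.0, 0.3)    # b > 0: more power on the a+b mode
lam_neg = optimal_split(2.0, -1.0, 1.0, 0.3)   # b < 0: the split is mirrored
```

The determinant computed directly from the matrices agrees with the value
of the criterion at the clamped maximizer, for either sign of `b`, and the
resulting entries satisfy the power constraint
$\mathrm{tr}\, G^\dagger G = f$.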


VI. CLOSED-FORM EXPRESSION FOR ALL PASSBAND $G \in \mathcal{P}$


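We have seen that, for nonexcess bandwidth systems, an extremal
$G$ for

$$ \det w_0 = \exp\left\{ -\frac{T}{2\pi} \int_{-\pi/T}^{\pi/T} \log \det \left[ \Phi(e^{-j\omega T}) + \frac{N_0}{\sigma_a^2} I \right] d\omega \right\} $$

exists in the space $\mathcal{P}$. Next we show that for each
$G \in \mathcal{P}$, whether or not it has excess bandwidth,
tr $w_0 = 2\sqrt{\det w_0}$. To do this we must show that $w_0$ is a
scalar matrix. First observe that the matrices $G$ and $C$ are in
$\mathcal{P}$, and their entries, being Fourier transforms of real time
functions, are Hermitian symmetric. The matrix
$\Phi(e^{-j\omega T}) + (N_0/\sigma_a^2) I$, which is designated by
$\mathcal{R}$ and is the Fourier transform of the matrix sequence in (24),
can be expressed in terms of the channel matrix $C(\omega)$ and a
passband transmitter matrix

$$ G(\omega) = \begin{pmatrix} G_1(\omega) & G_2(\omega) \\ -G_2(\omega) & G_1(\omega) \end{pmatrix} $$

as in (34) to yield

$$ \mathcal{R}(\omega) = \begin{pmatrix} R_1(\omega) & -jR_2(\omega) \\ jR_2(\omega) & R_1(\omega) \end{pmatrix}, $$

where, writing $\omega_n = \omega + 2\pi n/T$,

$$ R_1(\omega) = \frac{1}{T} \sum_n \left\{ \left| G_1(\omega_n) C_1(\omega_n) - G_2(\omega_n) C_2(\omega_n) \right|^2 + \left| G_1(\omega_n) C_2(\omega_n) + G_2(\omega_n) C_1(\omega_n) \right|^2 \right\} + \frac{N_0}{\sigma_a^2} \tag{41} $$

and

$$ R_2(\omega) = \frac{2}{T} \sum_n \mathrm{Im} \left\{ \left[ G_1(\omega_n) C_2(\omega_n) + G_2(\omega_n) C_1(\omega_n) \right]^* \left[ G_1(\omega_n) C_1(\omega_n) - G_2(\omega_n) C_2(\omega_n) \right] \right\}. \tag{42} $$

The entries $R_1$ and $R_2$ are real functions of $\omega$. $R_1$ is a
positive even function and $R_2$ is an odd function. The matrix
$\mathcal{R}$ is positive definite; i.e., $R_1 > |R_2|$. It is also
Hermitian and passband.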

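We have previously noted in eq. (25) that the matrix $\mathcal{R}$ can be
factored into the anticausal and causal matrices $U_-(e^{-j\omega T})$ and
$[U_-(e^{-j\omega T})]^\dagger$, respectively. The matrix $u_0 u_0^\dagger$,
which is proportional to the error matrix inverse, is unique and the factor
$U_-(e^{-j\omega T})$ is unique up to an arbitrary unitary matrix
post-multiplicative factor $V$. We now pick a particular unitary matrix.

The matrix $\mathcal{R}$ is diagonalized by the unitary matrix

$$ \Psi = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & -j \\ -j & 1 \end{pmatrix}; \quad \text{i.e.,} \quad \Psi^\dagger \mathcal{R} \Psi = \begin{pmatrix} R_1 - R_2 & 0 \\ 0 & R_1 + R_2 \end{pmatrix}. \tag{43} $$

Now the entries $R_1 - R_2$ and $R_1 + R_2$ are nonnegative real functions
on $-(\pi/T) \le \omega \le (\pi/T)$. Since

$$ -\infty < \int_{-\pi/T}^{\pi/T} \log (R_1 \pm R_2)\, d\omega, $$

we have from Szegő’s theorem! that

$$ R_1 - R_2 = |\alpha_-|^2 \tag{44a} $$

and

$$ R_1 + R_2 = |\beta_-|^2, \tag{44b} $$

where $\alpha_-$ and $\beta_-$ are anticausal functions of $\omega$, i.e.,

$$ \alpha_- = \sum_{m \le 0} \alpha_m e^{-j\omega mT} \tag{45a} $$

and

$$ \beta_- = \sum_{m \le 0} \beta_m e^{-j\omega mT}, \tag{45b} $$

the $\{\alpha_m\}$ and $\{\beta_m\}$ being sequences of complex numbers. We
can assume $\alpha_0$ and $\beta_0$ are real and positive without loss of
generality. Therefore

$$ \Psi^\dagger \mathcal{R} \Psi = V_- V_-^\dagger, \tag{46} $$

where

$$ V_- = \begin{pmatrix} \alpha_- & 0 \\ 0 & \beta_- \end{pmatrix}, $$

and since $\Psi$ is a unitary matrix,

$$ \mathcal{R} = (\Psi V_- \Psi^\dagger)(\Psi V_-^\dagger \Psi^\dagger) = U_- U_-^\dagger, \tag{47a} $$

where

$$ U_- = \Psi V_- \Psi^\dagger = \frac{1}{2} \begin{pmatrix} \alpha_- + \beta_- & j(\alpha_- - \beta_-) \\ -j(\alpha_- - \beta_-) & \alpha_- + \beta_- \end{pmatrix}. \tag{47b} $$

Thus in this factorization, $U_-$ can be taken to be passband. Furthermore,
since $\alpha_0$ and $\beta_0$ are real,

$$ u_0 = \frac{1}{2} \begin{pmatrix} \alpha_0 + \beta_0 & j(\alpha_0 - \beta_0) \\ -j(\alpha_0 - \beta_0) & \alpha_0 + \beta_0 \end{pmatrix} \tag{48} $$

is both Hermitian and passband, and so therefore is the error matrix
$\epsilon_0 = N_0 [u_0 u_0^\dagger]^{-1}$; i.e., its off-diagonal terms are
purely imaginary.* But we know that $\epsilon_0$, defined by (16), must
have real equal off-diagonal terms, and therefore $\epsilon_0$ must be a
scalar matrix. Thus $w_0 = (1/N_0)\epsilon_0$ is also scalar and

$$ \mathrm{tr}\, w_0 = 2\sqrt{\det w_0}. \tag{49} $$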

Summarizing the development so far, we have shown that, for nonexcess
bandwidth systems, if the transmitted power spectrum is specified, the
passband transmitter structure is optimum. We then showed that if the
transmitter has the passband structure, the MSE is given by eqs. (29b) and
(29c), (29b) holding with equality. Incidentally, using the results of the
last paragraph it can be shown that
$\alpha_0 = \beta_0 = \sqrt{2N_0/\mathrm{MMSE}}$; hence $u_0$ is known. In
Section III the optimum linear receiver filter $w(t)$ was found up to the
constant (matrix) multiplier $u_0^{-1}$. For nonexcess bandwidth passband
systems, we can now make the more complete statement that the matrix
Fourier transform of $w(t)$ is

$$ \sqrt{\frac{\mathrm{MMSE}}{2N_0}} \left[ \sqrt[-]{\Phi(e^{-j\omega T}) + \frac{N_0 I}{\sigma_a^2}}\ \right]^{-1} H^\dagger(\omega) $$

(where $\sqrt[-]{\ }$ means minimum-phase square root).

* It is important to notice that, although $u_0$ is unique only up to a
postmultiplicative unitary factor, the matrix $u_0 u_0^\dagger$ is unique.

We shall now express (29c) in a somewhat different form, which
avoids the use of the determinant. As pointed out earlier, $R_1(\omega)$ and
$R_2(\omega)$ are real even and odd functions, respectively. Thus

$$ R_1(\omega) - R_2(\omega) = R_1(-\omega) + R_2(-\omega) $$

and

$$ \int_{-\pi/T}^{\pi/T} \log [R_1(\omega) - R_2(\omega)]\, d\omega = \int_{-\pi/T}^{\pi/T} \log [R_1(\omega) + R_2(\omega)]\, d\omega, \tag{50} $$

from which it follows that

$$ \mathrm{tr}\, \epsilon_0 = 2N_0 \exp\left\{ -\frac{T}{4\pi} \int_{-\pi/T}^{\pi/T} \log \det \mathcal{R}(\omega)\, d\omega \right\} $$

$$ \phantom{\mathrm{tr}\, \epsilon_0} = 2N_0 \exp\left( -\frac{T}{4\pi} \left\{ \int_{-\pi/T}^{\pi/T} \log [R_1(\omega) - R_2(\omega)]\, d\omega + \int_{-\pi/T}^{\pi/T} \log [R_1(\omega) + R_2(\omega)]\, d\omega \right\} \right) $$

$$ \phantom{\mathrm{tr}\, \epsilon_0} = 2N_0 \exp\left\{ -\frac{T}{2\pi} \int_{-\pi/T}^{\pi/T} \log [R_1(\omega) + R_2(\omega)]\, d\omega \right\}. \tag{51} $$

Substituting (41) and (42) into (51) gives the following expression
for MSE:

$$ \mathrm{tr}\, \epsilon_0 = 2\sigma_a^2 \exp\left\{ -\frac{T}{2\pi} \int_{-\pi/T}^{\pi/T} \log \left[ \frac{\sigma_a^2}{N_0} X_{eq}(\omega) + 1 \right] d\omega \right\}, \tag{52} $$

where

$$ X_{eq}(\omega) = \frac{1}{T} \sum_n \left| G_1\!\left(\omega + \frac{2\pi n}{T}\right) + jG_2\!\left(\omega + \frac{2\pi n}{T}\right) \right|^2 \left| C_1\!\left(\omega + \frac{2\pi n}{T}\right) + jC_2\!\left(\omega + \frac{2\pi n}{T}\right) \right|^2. $$

This expression is valid for any passband transmitter and, as shown
in the previous section, it is valid for optimum general QAM transmitters
with no excess bandwidth.* We show in Appendix A that, under very general
assumptions, optimum passband transmitters will have no excess bandwidth.
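As a numerical sketch of the MSE expression, the fragment below evaluates
$2\sigma_a^2 \exp\{-(T/2\pi)\int \log[(\sigma_a^2/N_0) X_{eq}(\omega) + 1]\, d\omega\}$
by the midpoint rule. The caller supplies `x_eq` as a function of
frequency; the assumption that any $1/T$ factor is already contained in
`x_eq`, the function name, and the numerical values are illustrative and
not taken from the paper.

```python
import math


def mse_passband(x_eq, sigma2, n0, T, num_points=2048):
    """2*sigma2*exp{-(T/2pi) * int_{-pi/T}^{pi/T} log[(sigma2/n0)*x_eq(w) + 1] dw}."""
    dw = 2.0 * math.pi / (T * num_points)
    total = 0.0
    for k in range(num_points):
        w = -math.pi / T + (k + 0.5) * dw   # midpoint of the k-th subinterval
        total += math.log(sigma2 / n0 * x_eq(w) + 1.0)
    return 2.0 * sigma2 * math.exp(-(T / (2.0 * math.pi)) * total * dw)


# Flat equivalent spectrum: the integral is trivial and the MSE reduces to
# 2 * sigma2 / (sigma2 / n0 + 1).
mse_flat = mse_passband(lambda w: 1.0, sigma2=1.0, n0=0.1, T=1.0)
```

For a frequency-flat `x_eq` the result is the familiar per-stream value
$\sigma_a^2 N_0 / (\sigma_a^2 + N_0)$ counted twice, once for each data
stream.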





* We remark at this point that if we had restricted attention to passband
transmitter structures from the outset, we could have derived the MSE
expression (52) more directly by using the complex envelope notation
referred to in Section II instead of the matrix formulation.


VII. OPTIMUM TRANSMITTER


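Here we continue under the assumption of a nonexcess bandwidth
system. So far, we know that, if $f = \mathrm{tr}\, G^\dagger G$ is
specified, an optimal passband $G$ exists yielding a minimum for MSE and
possessing power spectral density $f$. Our next step is to free
tr $G^\dagger G$ and to find $G$, which minimizes MSE subject only to the
constraint that

$$ \frac{T}{2\pi} \int_{-\pi/T}^{\pi/T} \mathrm{tr}\, G^\dagger G\, d\omega \left( = \frac{T}{2\pi} \int_{-\pi/T}^{\pi/T} f\, d\omega \right) = P. \tag{53} $$

Notice

$$ \int_{-\pi/T}^{\pi/T} |G_1 + jG_2|^2\, d\omega = \int_{-\pi/T}^{\pi/T} \left( |G_1|^2 + |G_2|^2 \right) d\omega \tag{54} $$

since $G_1^* G_2 - G_1 G_2^*$ is odd. Thus our problem becomes to find
$|G_1 + jG_2|^2$, minimizing

$$ 2\sigma_a^2 \exp\left\{ -\frac{T}{2\pi} \int_{-\pi/T}^{\pi/T} \log \left( \frac{\sigma_a^2}{N_0 T} |G_1 + jG_2|^2\, |C_1 + jC_2|^2 + 1 \right) d\omega \right\} $$

subject to

$$ \frac{T}{2\pi} \int_{-\pi/T}^{\pi/T} |G_1 + jG_2|^2\, d\omega = P. $$

It is shown in Appendix B that the solution to this problem is given
uniquely by

$$ |G_1 + jG_2|^2 = \left( \Theta - \frac{N_0 T}{\sigma_a^2\, |C_1 + jC_2|^2} \right)_+ \quad \left( \text{where } (\xi)_+ \triangleq \max[\xi, 0] \right), $$

where $\Theta$ is a constant set at a value so that

$$ \frac{T}{2\pi} \int_{-\pi/T}^{\pi/T} |G_1 + jG_2|^2\, d\omega = P. $$

This solution also occurs in a related context in information theory,
where it is dubbed “the water-pouring solution.”!®

Since $(G_1^*(\omega) G_2(\omega) - G_1(\omega) G_2^*(\omega))$ is odd and
$f(\omega)$ is even, we average $|G_1(\omega) + jG_2(\omega)|^2$ and
$|G_1(-\omega) + jG_2(-\omega)|^2$ to get*

$$ f(\omega) = \frac{1}{2} \left\{ \left[ \Theta - \frac{N_0 T}{\sigma_a^2\, |C_1(\omega) + jC_2(\omega)|^2} \right]_+ + \left[ \Theta - \frac{N_0 T}{\sigma_a^2\, |C_1(-\omega) + jC_2(-\omega)|^2} \right]_+ \right\}. $$

To find $G_1(\omega)$ and $G_2(\omega)$, use the above $f(\omega)$ in
Section V.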
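The water-pouring level $\Theta$ is defined only implicitly by the power
constraint, so in practice it has to be found numerically. The following
sketch does this by bisection on a discretized frequency grid. The names
are hypothetical: `noise_levels` stands for samples of
$N_0 T / (\sigma_a^2 |C_1 + jC_2|^2)$ on the grid, and `weight` stands for
the quadrature factor $(T/2\pi)\,\Delta\omega$; the numbers are
illustrative only.

```python
def water_pour(noise_levels, power, weight=1.0, iters=200):
    """Find theta with weight * sum(max(theta - n, 0)) == power by bisection.

    Returns (theta, allocation), where allocation[k] = max(theta - n_k, 0)
    is the power density placed at grid point k.
    """
    lo = min(noise_levels)                    # allocates no power at all
    hi = max(noise_levels) + power / weight   # allocates at least `power`
    for _ in range(iters):
        theta = 0.5 * (lo + hi)
        used = weight * sum(max(theta - n, 0.0) for n in noise_levels)
        if used > power:
            hi = theta
        else:
            lo = theta
    theta = 0.5 * (lo + hi)
    return theta, [max(theta - n, 0.0) for n in noise_levels]


# Hypothetical three-point grid: the worst frequency receives no power.
theta, alloc = water_pour([1.0, 2.0, 5.0], power=3.0)
```

Frequencies where the channel gain is too small (a high noise level
relative to $\Theta$) are left unused, which is the characteristic feature
of the water-pouring description.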





+[e- 


* Note that for $N_0 \to 0$, the optimum $f(\omega)$ tends to a constant.


VIII. THE ROLE OF NONEXCESS BANDWIDTH SYSTEMS


In the previous sections we have determined the optimum trans- 
mitter under the hypothesis that the system is nonexcess bandwidth. 
Here we point out that this hypothesis is not very restrictive. 

In systems in which the transmitter is required to be passband, it 
follows, under very mild assumptions on the channel characteristics, 
that the optimum transmitter (subject to an output power constraint) 
is a nonexcess bandwidth system. The mathematical proof of the 
optimality of the nonexcess bandwidth system is considered in detail 
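in Appendix A. For an example, if for each $\omega$ ($|\omega| < \pi/T$)

$$|C_1(\omega) + jC_2(\omega)| > \left| C_1\!\left(\omega + \frac{2\pi k}{T}\right) + jC_2\!\left(\omega + \frac{2\pi k}{T}\right) \right| \qquad (k \neq 0),$$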





then the optimal transmitter has no energy outside 


|e 


For systems allowing any matrix transmitter, the question arises 
whether or not the optimal transmitter is passband. If the answer to 
the question is negative, the next question is whether or not the opti- 
mal transmitter is nonexcess bandwidth. The answers to these ques- 
tions depend on the system parameters, and there are channels for 
which the answers to both questions are negative. It is beyond the 
scope of this paper to give a detailed mathematical discussion of these 
more complex systems. Such systems are still under investigation, and 
so we shall limit ourselves to mentioning without proof some important 
facts concerning the analysis of such systems. 

The analysis begins by returning to Section IV fixing w and posing 
the extremal problem of 


max det (> GICIC.G. + Nol), 


subject to tr >> GiG, = f. If for each w it is optimal to expend all of 
f on one of the G;,’s, then we are in the line pursued in the previous sec- 
tions. However, to achieve optimality one may need to use more than 
one G. Indeed, H. Witsenhausen has solved this determinant extremal 
problem showing that at most two G;’s are required to achieve opti- 
mality, and there are instances where two G;’s are necessary. Even 
when two G;’s are needed, the wo matrix remains a scalar matrix and 
once again the trace and the determinant optimization are equivalent. 


The fact that two G;’s are required means the transmitter is excess 
bandwidth. 





lo — 2xfo| <3. 
Witsenhausen has shown that when two $G_i$'s are needed, one can


be taken to be a multiple of = 1) and the other a multiple of 


¢ ) Although both G;’s cannot have the passband form, the 


& 2 ) matrix corresponds to a very simple structural variation of 
a passband filter. 

We mention in closing that systems whose optimization takes us 
outside the realm of passband structures can be analyzed via equivalent 
canonical nonexcess bandwidth passband systems. The equivalence is 
in the sense that MMSE versus P curves for the two systems are 
identical, and optimum design can be carried out in the canonical 
system and then transformed to the more complicated system. 
APPENDIX A
Optimality of a Nonexcess Bandwidth System 


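Fix $P > 0$ and $q(\omega)$ a positive continuous real function on
$(-\infty, +\infty)$. In the text we are confronted with the optimization

$$\sup \int_{-\pi/T}^{\pi/T} \log\left[ \sum_k r\!\left(\omega + \frac{2\pi k}{T}\right) q\!\left(\omega + \frac{2\pi k}{T}\right) + 1 \right] d\omega,$$

where the sup is over all nonnegative Lebesgue integrable $r(\omega)$ for which

$$\int_{-\infty}^{+\infty} r(\omega)\, d\omega \le P.$$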


We show here that, under weak conditions on qg(w), the optimization 
problem can be replaced by an equivalent ‘‘nonexcess bandwidth 
problem,’’ namely, find 


+r/T 
sup [ log [r(w)g(w) + 1, 


where G(w) is a given continuous function and 7(w) is any nonnegative 
integrable function satisfying 


x{[T 
/ ANd PSO. 


—1/{T 
Define G(ω) on [−π/T, π/T] to be the envelope sup_k q(ω + 2πk/T).
To avoid annoying pathologies, assume g(w) is such that for each w 


{el a(o+ Fe) = ae)| 


is not empty. Moreover, assume that (—7/7', 7/T) can be expressed 
as a disjoint union of subsets {Vm}? of total measure 27/7 such that 
on each Vm there exists a km SO 


a(o + ht) = Ho) 


holds uniformly in w on V,,. So G(w) is continuous on (—7/T, 7/T). 


Define 
m Qrkm 
v= 0 (vo +2), 


Given any r(w) = 0 satisfying ||r||1 = P, define p on (— ~,~) by 











2rkm = r(o +e) forw€ Vp» m=1,2,---,M 
p{owt+ 7 ae eae T 
0 wE VY. 
So 
00 0 a/T 
/ pdw = > p(o +r) de 
—«0 —o J—r/T f& 
r{T r/T 
= [0 er(otae a5 [oor (ot Ae ) ae =P, 
—r/T h k J—nr/T T 


where the second equality results from the definition of p and the third
equality is from the Lebesgue Dominated Convergence Theorem. Now
for |w| < 7/T 


Er (ot ae )a(ot Se) sor (ot Fe ae) 
= He) Ep(o + Ht) = Fo (o+ Ya (ot ae), 


where the very last equality follows from the fact that p vanishes off 
V. Since in L[—7a/T, 7/T ] p(w) always fares at least as well as r(w), 
we have the fact that the supremum can be taken over the class of 
nonnegative functions vanishing off V. 

In the applications it often occurs that V C (—2/T, 7/T), in which 
case g(w) = G(w) on V and the optimization problem becomes


sup [ ie log [r(w)q(w) + 1 ]dw. 


Ilr] =P 


Even when V ( (—72/T, 7/T), we need only solve 


w/T 
sup log [¥(w)G(w) + 1 ]dw 
lal =P 2 —/T 

with the optimand “rearranged” to produce the desired r(w). The re- 
arrangement procedure is simply that, for each w € (—7/T, «/T), we 
define r(w + 2zkm/T) = 7(w). Elsewhere r(w) is defined to be zero. 
In dealing with even q(w), if V C (~7/T, x/T), the rearrangement 
produces an uneven r(w). When this occurs, [r(w) + r(—w) ]/2 pro- 
vides an even optimand. 


APPENDIX B 
Maximization of the Exponent Functional 


Let $q(\omega)$ be a continuous positive function on an interval $[a, b]$.
Fixing a real number $P > 0$, let $\Gamma$ be the convex set of nonnegative
continuous functions with integral less than or equal to $P$. We seek
to maximize the nonlinear functional

$$I(\gamma) \triangleq \int \log (q\gamma + 1).$$

This same problem occurs in classical information theory where, for
reasons we shall see, it is dubbed “the water-pouring problem.”
Although the solution is correctly described in the literature, the
supporting arguments are formal (for example, see Ref. 16 or 17). We
give a rigorous proof here, although our argument is not constructive
in that the extremal function is “pulled out of the air.” To motivate
the extremal function, the reader can turn to the references or
supply for himself a variational derivation.

Now $I(\gamma)$ is concave on $\Gamma$, as we see by employing the Leibniz rule
to confirm the strict negativity of $I''[\lambda\gamma_1 + (1 - \lambda)\gamma_2]$ on $0 \le \lambda \le 1$
with $\gamma_1$ and $\gamma_2$ in $\Gamma$ (differentiation is with respect to $\lambda$). It is clear that
if the extremal function exists it has integral equal to $P$ and so we can
redefine $\Gamma$ to require equality of the integrals.

For each constant $\mathcal{C}$, the function $(\mathcal{C} - q^{-1})_+$ denotes the function
equal to $\mathcal{C} - q^{-1}$ when $\mathcal{C} - q^{-1} > 0$ and equal to zero otherwise. Now
$\int (\mathcal{C} - q^{-1})_+$ is a continuous strictly increasing function of $\mathcal{C}$ with range
$[0, \infty]$. Fix $\mathcal{C}$ so $\int (\mathcal{C} - q^{-1})_+ = P$ and call the resulting function $\hat\gamma$.

Fig. 7—Optimal power spectral density.

To show $I(\hat\gamma)$ is the global maximum of $I(\gamma)$ over $\Gamma$, let $\gamma_1$ denote
any other function in $\Gamma$ and let us investigate the segment
$\{\lambda\hat\gamma + (1 - \lambda)\gamma_1,\ 0 \le \lambda \le 1\}$. Now $I[\lambda\hat\gamma + (1 - \lambda)\gamma_1]$ is concave in
$\lambda$ and straightforwardly

$$I'[\lambda\hat\gamma + (1 - \lambda)\gamma_1]\Big|_{\lambda = 1} = \mathcal{C}^{-1}\left[ P - \int_{\hat\gamma = 0} \mathcal{C} q \gamma_1 - \int_{\hat\gamma > 0} \gamma_1 \right],$$

which is nonnegative as $\mathcal{C}q \le 1$ wherever $\hat\gamma = 0$. By definition, for a concave function
the graph lies above any chord joining two points on the graph. So
$\lambda = 1$ must be a point of global maxima of the segment.

Also, $\hat\gamma$ is the unique point of maxima since, if there were another
point of maxima $\tilde\gamma$, we would have $I(\gamma)$ constant on the line segment
joining $\hat\gamma$ and $\tilde\gamma$, contradicting the strict negativity of $I''$.

To understand the water-pouring terminology, look at Fig. 7 where
we consider the graph of $q^{-1}$ with vertical walls based at $[a, q^{-1}(a)]$ and
$[b, q^{-1}(b)]$ to be a vessel into which water of amount (area) $P$ is poured.
Relocate the $\omega$ axis to the water level line. Then reflecting the water
accumulation about the level line gives the shape of $\hat\gamma$.
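The water-pouring construction can be tried numerically. The sketch below is an illustration only, not part of the original proof: it assumes q has been sampled on a uniform grid of cell width `width`, and the function names `water_pour` and `objective` are hypothetical. It bisects on the water level until the poured area equals P, then evaluates the functional on the result.

```python
import math


def water_pour(q, width, power, iterations=200):
    """Return (level, gamma) with gamma[i] = max(level - 1/q[i], 0) and
    sum(gamma) * width == power (up to bisection accuracy)."""
    q_inv = [1.0 / v for v in q]           # the "vessel floor" q^{-1}
    lo = min(q_inv)                        # no water poured at this level
    hi = max(q_inv) + power / width        # certainly enough water here
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        poured = sum(max(mid - v, 0.0) for v in q_inv) * width
        if poured < power:
            lo = mid
        else:
            hi = mid
    level = 0.5 * (lo + hi)
    return level, [max(level - v, 0.0) for v in q_inv]


def objective(q, gamma, width):
    """Discretized I(gamma) = integral of log(q * gamma + 1)."""
    return sum(math.log(qi * gi + 1.0) for qi, gi in zip(q, gamma)) * width


q_samples = [4.0, 2.0, 1.0, 0.25]          # assumed sample values of q
level, gamma_hat = water_pour(q_samples, width=1.0, power=1.0)
uniform = [0.25] * 4                       # same total power, spread evenly
print(level, gamma_hat)
print(objective(q_samples, gamma_hat, 1.0), objective(q_samples, uniform, 1.0))
```

With these sample values only the two cells with the largest q receive power, and the water-pouring allocation gives a larger value of the functional than the uniform allocation of the same total power.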

We mention in closing that $\hat\gamma$ is optimal in a larger set than $\Gamma$,
obtained by requiring integrability rather than continuity in the
definition of the constraint set. The optimality over the larger set follows
from a function space continuity argument.
A Universal Digital Data Scrambler


By DAVID G. LEEPER 
(Manuscript received May 22, 1973) 


Analyses in the literature of digital communications often presuppose
that the digital source is “white,” that is, that it produces stochastically
independent equiprobable symbols. In this paper we show that it is possible
to “whiten” to any degree all the first- and second-order statistics of any
binary source at the cost of an arbitrarily small controllable error rate.
Specifically, we prove that the self-synchronizing digital data scrambler,
already shown effective at scrambling strictly periodic data sources, will
scramble any binary source to an arbitrarily small first- and second-order
probability density imbalance $\delta$ if (i) the source is first passed through the
equivalent of a symmetric memoryless channel with an arbitrarily small
but nonzero error probability $\epsilon$, and (ii) the scrambler contains $M$ stages
where

$$M = 1 + \log_2 \left[ \frac{\ln 2\delta}{\ln (1 - 2\epsilon)} \right].$$


Some interpretations and applications of this result are included. 


I. INTRODUCTION AND SUMMARY 


Digital transmission systems often have impairments which vary 
with the statistics of the digital source. Timing, crosstalk, and equaliza- 
tion problems usually involve source statistics in some way. While 
redundant transmission codes may be used to help isolate system 
performance from source statistics, the isolation is not always complete, 
and such codes generate additional problems by increasing the required 
symbol rate or the number of levels per symbol which must be trans- 
mitted. In addition, with or without transmission codes, it is always 
easiest to analyze or predict system impairments if we assume that the 
source symbols are stochastically independent and equiprobable. We
shall refer to such a source as “white” because of the obvious analogy
to white Gaussian noise. Methods for “whitening” the statistics of
digital sources without using redundant coding generally come under
the heading of scrambling.

We describe here a nonredundant scrambling/descrambling method
which in principle will satisfactorily whiten the statistics of any binary
source. The technique is based upon the self-synchronizing digital data
scrambler. Savage has shown (Ref. 1) that this device is very effective at
scrambling strictly periodic digital sources. In this paper it is proven
that the same device will scramble any binary digital source to an
arbitrarily small first- and second-order probability density imbalance
$\delta$ if (i) the source is first passed through the equivalent of a binary
symmetric memoryless channel with an arbitrarily small but nonzero
error probability $\epsilon$, and (ii) the scrambler contains $M$ stages where
$M = 1 + \log_2 [(\ln 2\delta)/\ln (1 - 2\epsilon)]$. In other words, at the cost of an
arbitrarily small controllable error rate, one can “whiten” to any degree
all the first- and second-order statistics of any binary source. This relaxes
the restriction frequently found in the literature in which the source is
assumed a priori to produce only independent equiprobable symbols.
An auxiliary result is that the above relation for $M$ is useful when
designing a standard self-synchronizing scrambler for a given application.
Heuristically speaking, the relation expresses the “power” of the
scrambler by linking the “randomness” of the input and output to the
scrambler length, $M$.

In Sections II and III of this paper we examine some properties of 
scramblers, maximal length sequences, and mod-2 sums of binary 
random variables. With these discussions as background, we prove the 
main theorem in Section IV. In Section V we derive bounds for the 
autocorrelation of the scrambled sequence. Section VI contains some 
practical considerations involved in applying the theorem of Section 
IV. Because they add insight, we give simple direct proofs for the
lemmas and theorem of Sections III and IV. 


II. SCRAMBLERS AND MAXIMAL LENGTH SEQUENCES 


Figure 1 shows a five-stage self-synchronizing scrambler and
descrambler (Ref. 1). As seen, both are linear sequential filters, the scrambler
utilizing feedback paths and the descrambler feedforward paths. Each
cell represents a unit delay. We restrict our attention to the binary
case and use the symbols $\oplus$ and $\bigoplus$ to denote mod-2 addition.
Representing the data as shown, we have

$$b_k = a_k \oplus b_{k-3} \oplus b_{k-5}$$

and

$$c_k = b_k \oplus b_{k-3} \oplus b_{k-5} = a_k,$$

which shows that the descrambled sequence is identically equal to the
original data sequence. The descrambler is self-synchronizing because
the effect of a channel error, insertion, or deletion lasts only as long as
the total delay of the register, five bit-intervals in this example.

Fig. 1—(a) Five-stage scrambler. (b) Five-stage descrambler.
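The feedback and feedforward recursions of the five-stage example can be sketched in a few lines. This is a minimal illustration, assuming delay taps at 3 and 5 bit-intervals and registers that start at all zeros; the function names and the test pattern are hypothetical. It checks the round trip and shows how a single channel error spreads in the descrambled output.

```python
from collections import deque

TAPS = (3, 5)  # assumed delays feeding the mod-2 adders


def scramble(bits, taps=TAPS):
    """Feedback form: b_k = a_k XOR b_{k-3} XOR b_{k-5}."""
    n = max(taps)
    reg = deque([0] * n, maxlen=n)   # reg[0] = b_{k-1}, ..., reg[n-1] = b_{k-n}
    out = []
    for a in bits:
        b = a
        for t in taps:
            b ^= reg[t - 1]
        reg.appendleft(b)            # the register stores scrambler OUTPUT bits
        out.append(b)
    return out


def descramble(bits, taps=TAPS):
    """Feedforward form: c_k = b_k XOR b_{k-3} XOR b_{k-5}."""
    n = max(taps)
    reg = deque([0] * n, maxlen=n)
    out = []
    for b in bits:
        c = b
        for t in taps:
            c ^= reg[t - 1]
        reg.appendleft(b)            # the register stores RECEIVED bits
        out.append(c)
    return out


data = [1, 0, 1, 1, 0, 0, 1, 0] * 8
line = scramble(data)
recovered = descramble(line)

corrupted = list(line)
corrupted[20] ^= 1                   # one channel error at bit 20
damaged = descramble(corrupted)
errors = [k for k, (x, y) in enumerate(zip(data, damaged)) if x != y]
print(recovered == data, errors)
```

In this sketch the single error reappears at the error position and at each tap delay after it, and the output is clean again once the error has left the five-stage register.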

Let us consider the general scrambler of Fig. 2a with the input
stream disconnected. Under such a condition, the scrambler becomes
a sequence generator whose output must ultimately become periodic
because (i) future states of the register are completely determined by
the present state (the state of the register is the contents of its stages)
and (ii) only the finite number $2^M$ of states are possible, where $M$ equals
the number of stages. One of these, the all-zeros state, simply leads to
an all-zeros output. Discounting this state, we see that the longest
possible period from the generator must be $2^M - 1$ bits. It is proven in
the literature (Refs. 2, 3) that with the proper choice of feedback taps we can
generate such a maximal length sequence for any $M$.
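The period argument can be checked by brute force for a small register. The sketch below is an illustration under assumptions: it models the free-running register with the input disconnected, uses taps (3, 5) for a five-stage register, and compares them with a deliberately poor tap choice. The function name `lfsr_period` is hypothetical.

```python
def lfsr_period(taps, stages):
    """Number of shifts before the free-running register returns to its
    starting state. Assumes the last stage is tapped (stages in taps),
    which makes the state transition invertible so the loop terminates."""
    start = (1,) + (0,) * (stages - 1)     # state[i] holds b_{k-1-i}
    state = start
    steps = 0
    while True:
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        state = (feedback,) + state[:-1]   # shift the new bit in
        steps += 1
        if state == start:
            return steps


print(lfsr_period((3, 5), 5))              # maximal: 2**5 - 1
print(lfsr_period((1, 2, 3, 4, 5), 5))     # a non-maximal tap choice
```

The first tap set visits every nonzero state before repeating, while the second falls into a much shorter cycle, which is why the choice of feedback taps matters.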

Registers which generate maximal length sequences make very 
effective scramblers because of their ability to dissociate one scrambler 
output bit from another. This property will enable us to show that two 
arbitrarily chosen output bits tend to be very weakly correlated. We 
state this essential property here in the form of a lemma. 


Lemma 1: From Fig. 2a it is evident that each “b” bit is equal to a lengthy
mod-2 summation of selected “a” bits. Choose two bits, $b_m$ and $b_n$, $m > n$,
and define $J_{mn}$ to be the number of “a” bits which enter the summation for
$b_m$ but not the summation for $b_n$. That is, $b_m$ is dissociated from $b_n$ by the
mod-2 sum of $J_{mn}$ “a” bits.

Then, if $n > 2^{M+1}$ (that is, the scrambler has processed at least $2^{M+1}$
“a” bits),

$$J_0 = \min_{m,n} [J_{mn}] \ge 2^{M-1}. \tag{1}$$

In other words, after a settling time of $2^{M+1}$ bits, any chosen pair of
output bits will differ by the mod-2 sum of at least $2^{M-1}$ input bits.

Proof: See appendix.

Fig. 2—(a) M-stage scrambler. (b) M-stage descrambler.
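Lemma 1 can be spot-checked numerically for a small scrambler. The following sketch is an illustration only and assumes a five-stage scrambler with taps at delays 3 and 5 and an all-zeros initial register; the helper names are hypothetical. It builds the impulse response of the scrambler, counts for each pair (m, n) the input bits that enter $b_m$ but not $b_n$, and reports the smallest count found over a range of pairs with n beyond the settling time.

```python
def impulse_response(length, taps=(3, 5)):
    """h_k: scrambler output for input a_0 = 1, a_i = 0 otherwise."""
    h = []
    for k in range(length):
        bit = 1 if k == 0 else 0
        for t in taps:
            if k - t >= 0:
                bit ^= h[k - t]
        h.append(bit)
    return h


def dissociation(m, n, h):
    """J_mn: number of input bits a_i that enter b_m but not b_n (m > n).
    a_i enters b_m when h[m - i] == 1 and enters b_n when h[n - i] == 1."""
    count = 0
    for i in range(m + 1):
        in_m = h[m - i]
        in_n = h[n - i] if i <= n else 0
        if in_m and not in_n:
            count += 1
    return count


M = 5
h = impulse_response(400)
settle = 2 ** (M + 1)
j_values = [dissociation(n + d, n, h)
            for n in range(settle + 1, settle + 40)
            for d in range(1, 70)]
print(min(j_values), 2 ** (M - 1))
```

Within the searched range the smallest count equals the bound, and it occurs when m and n are separated by exactly one period of the maximal length sequence.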


III. MOD-2 SUMS OF BINARY RANDOM VARIABLES 


Throughout this paper we assume that a data sequence may be 
modeled as a sequence of binary random variables defined on a suitable 
probability space. In this section we state as lemmas two essential 
properties of mod-2 sums of binary random variables. Since the 
scrambler output is formed from mod-2 sums of input bits, these 
properties play a key role in determining the scrambler output charac- 
teristics. We include the proofs in the text because the equations 
involved will be useful later on. 


Lemma 2: Consider two independent binary random variables $r_1$ and $r_2$.
A third binary random variable $r_3 = r_1 \oplus r_2$. Let

$$p_i = P(r_i = 1) = 1 - P(r_i = 0), \qquad i = 1, 2, 3.$$

Then

$$|p_3 - \tfrac12| \le \min\left[\,|p_2 - \tfrac12|,\ |p_1 - \tfrac12|\,\right]$$

with equality if and only if $p_1$ or $p_2 = \tfrac12$, 0, or 1. In other words, $r_3$ is
as close or closer to being equiprobable than either $r_1$ or $r_2$.

Proof: Since $r_1$ and $r_2$ are independent,

$$p_3 = p_2(1 - p_1) + p_1(1 - p_2). \tag{2}$$

Let

$$d_i = p_i - \tfrac12, \qquad i = 1, 2, 3.$$

Then by substitution

$$|d_3| = 2|d_1||d_2|.$$

But since

$$|d_i| \le \tfrac12,$$

we have

$$|d_3| \le \min\left[\,|d_1|,\ |d_2|\,\right]$$

with equality if and only if $|d_1|$ or $|d_2| = 0$ or $\tfrac12$.

Corollary to Lemma 2: If $p_1 = \tfrac12$, then $p_3 = \tfrac12$ and $r_3$ is independent of $r_2$.

Proof: Since $r_3 = r_1 \oplus r_2$, $P(r_3 = 1 \mid r_2 = 1) = 1 - p_1 = \tfrac12$. But by eq.
(2), $p_3 = \tfrac12$. Thus, $P(r_3 = 1 \mid r_2 = 1) = P(r_3 = 1) = \tfrac12$, which implies
$r_3$ and $r_2$ are independent.
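The substitution step in the proof of Lemma 2 can be written out explicitly. The block below is only a worked expansion of eq. (2) using the definitions already given; it introduces no new symbols.

```latex
\begin{align*}
p_3 &= p_2(1-p_1) + p_1(1-p_2) = p_1 + p_2 - 2p_1p_2, \\
d_3 &= p_3 - \tfrac12
     = \left(d_1+\tfrac12\right) + \left(d_2+\tfrac12\right)
       - 2\left(d_1+\tfrac12\right)\left(d_2+\tfrac12\right) - \tfrac12 \\
    &= \left(d_1 + d_2 + 1\right) - \left(2d_1d_2 + d_1 + d_2 + \tfrac12\right) - \tfrac12
     = -2\,d_1 d_2 .
\end{align*}
```

Taking magnitudes gives $|d_3| = 2|d_1||d_2|$, and since each $|d_i| \le \tfrac12$, the factor $2|d_1|$ (or $2|d_2|$) is at most 1, which yields the stated minimum.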


Lemma 3: Consider now a sequence of independent binary random
variables $\{r_k,\ k = 1, 2, \cdots\}$ with

$$P(r_k = 1) = 1 - P(r_k = 0) = \epsilon \qquad \text{for all } k.$$

We form the mod-2 sum

$$R_n = \bigoplus_{k=1}^{n} r_k \tag{3}$$

and let $P_n = P(R_n = 1)$. Then

$$P_n = \tfrac12\left[1 - (1 - 2\epsilon)^n\right]; \qquad n \ge 1. \tag{4}$$

Note that, as $n \to \infty$, $P_n$ converges to $\tfrac12$ for all $0 < \epsilon < 1$. However,
we shall be concerned only with finite values for $n$.

Proof: By applying eq. (2) repeatedly, it is easily shown that the
sequence $P_n$ satisfies

$$P_n = (1 - 2\epsilon)P_{n-1} + \epsilon; \qquad n \ge 2,$$

and

$$P_1 = \epsilon.$$

The solution to this first-order linear difference equation is given by
eq. (4).
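The closed form of eq. (4) can be compared with the difference equation directly. The sketch below is an illustrative check only; the function names and the sample value of the error probability are assumptions.

```python
def p_closed(n, eps):
    """Eq. (4): probability that the mod-2 sum of n bits equals one."""
    return 0.5 * (1.0 - (1.0 - 2.0 * eps) ** n)


def p_recursive(n, eps):
    """Iterate P_n = (1 - 2*eps) * P_{n-1} + eps starting from P_1 = eps."""
    p = eps
    for _ in range(n - 1):
        p = (1.0 - 2.0 * eps) * p + eps
    return p


eps = 0.01                      # assumed crossover probability
for n in (1, 2, 16, 128, 1024):
    print(n, p_closed(n, eps), p_recursive(n, eps))
```

Both columns agree to rounding error, and the values approach one-half as n grows, which is the behavior the scrambler analysis relies on.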


IV. A UNIVERSAL DIGITAL DATA SCRAMBLER 


With the help of the lemmas, we may now derive the main result.
We model the source as a device which generates a sequence of binary
random variables $\{s_k\}$ with completely unknown statistics. Our goal is
to find a scrambling/descrambling method such that the scrambled
sequence $\{b_k\}$ will have statistics which approach those of the
independent equiprobable (“white”) sequence $\{w_k\}$. If we attempt to
scramble $\{s_k\}$ directly as in Fig. 2, we are faced with a dilemma. The
scrambler simply provides a one-to-one mapping between its input and
output. As long as we have no knowledge or control of the statistics of
$\{s_k\}$, the statistics of $\{b_k\}$ must likewise remain unknown and
uncontrolled. Hence, the self-synchronizing scrambler alone cannot be
universal.

Instead of scrambling directly, we proceed as shown in Fig. 3. The
source output is first passed through the equivalent of a binary
symmetric memoryless channel (BSC) with crossover probability $\epsilon > 0$.
Remarkably, no matter how small $\epsilon$ may be, this modification of the
source sequence is sufficient to guarantee that the first- and second-order
probability densities for $\{b_k\}$ will approach those of $\{w_k\}$ to
within an arbitrarily small difference $\delta$. The only requirement is that
$M$, the length of the scrambler, be dependent upon the choice of $\epsilon$ and
$\delta$. This is the essence of the theorem which we derive below. (We note
in passing that the descrambled sequence will now differ from the
original source sequence by the error rate $\epsilon$, but since $\epsilon$ may be chosen
arbitrarily small, we assume for now that this is of no consequence.)

To begin, we observe that because of the BSC the scrambler input
sequence may be written

$$a_k = s_k \oplus r_k, \qquad k = 0, 1, 2, \cdots, \tag{5}$$

where

$$P(r_k = 1) = 1 - P(r_k = 0) = \epsilon.$$

From Lemma 1 we have seen that the action of the $M$-stage scrambler
is to dissociate any chosen pair of bits $(b_m, b_n)$ by the mod-2 sum of at
least $2^{M-1}$ “a” bits. Let us assume that $b_m$ and $b_n$ are dissociated by
exactly $2^{M-1}$ “a” bits and that they are related by

$$b_m = b_n \oplus \bigoplus_{l=1}^{2^{M-1}} a_l. \tag{6}$$

Fig. 3—(a) Universal scrambler. (b) Descrambler.

(Here the subscript $l$ is unrelated to the original position of $a_l$ in the
scrambler input stream.) In what follows we show that $P(b_m = 1) \approx \tfrac12$
and that $b_m$ and $b_n$ are nearly independent. For these purposes the use
of eq. (6) represents a worst-case analysis. By substitution from eq.
(5) we may write

$$b_m = \left[ b_n \oplus \bigoplus_{l=1}^{2^{M-1}} s_l \right] \oplus \left[ \bigoplus_{l=1}^{2^{M-1}} r_l \right] = A \oplus R, \tag{7}$$

where $A$ and $R$ equal the first and second bracketed terms, respectively.
Since the bits comprising $R$ are independent from those comprising $A$,
$R$ is independent of $A$. Furthermore, by Lemma 3,

$$P(R = 1) = \tfrac12\left[1 - (1 - 2\epsilon)^{2^{M-1}}\right]. \tag{8}$$

Therefore, by Lemma 2, no matter what the value of $P(A = 1)$,

$$\delta' \triangleq \left|P(b_m = 1) - \tfrac12\right| \le \tfrac12 (1 - 2\epsilon)^{2^{M-1}} \triangleq \delta. \tag{9}$$

It follows that so long as $\epsilon > 0$ we may force $\delta$ and $\delta'$ to be arbitrarily
small by choosing a large enough $M$. Specifically, for a given $\delta$,

$$M \ge 1 + \log_2 \left[ \frac{\ln 2\delta}{\ln (1 - 2\epsilon)} \right]; \qquad 0 < \epsilon < \tfrac12. \tag{10}$$

Since $\delta$ may be made arbitrarily small, the density function $p(b_m)$ may
be made nearly white, and it follows that all first-order statistics of the
scrambled sequence may be made nearly white.
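Equation (10) is easy to evaluate for a given pair of design values. The sketch below is an illustration, with hypothetical function names and assumed sample values; it returns the smallest integer M satisfying eq. (10) and reports the imbalance bound of eq. (9) for that M and for one stage fewer.

```python
import math


def required_stages(eps, delta):
    """Smallest integer M with M >= 1 + log2[ln(2*delta) / ln(1 - 2*eps)]."""
    if not (0.0 < eps < 0.5 and 0.0 < delta < 0.5):
        raise ValueError("need 0 < eps < 1/2 and 0 < delta < 1/2")
    ratio = math.log(2.0 * delta) / math.log(1.0 - 2.0 * eps)  # positive
    return max(1, math.ceil(1.0 + math.log2(ratio)))


def imbalance_bound(eps, stages):
    """Right-hand side of eq. (9): (1/2) * (1 - 2*eps) ** (2 ** (stages - 1))."""
    return 0.5 * (1.0 - 2.0 * eps) ** (2 ** (stages - 1))


eps, delta = 0.01, 0.05          # assumed design values
m = required_stages(eps, delta)
print(m, imbalance_bound(eps, m), imbalance_bound(eps, m - 1))
```

The bound for the returned M falls below the target imbalance, while one stage fewer does not, because each added stage squares the factor $(1 - 2\epsilon)^{2^{M-1}}$.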

Our having shown $P(b_m = 1) \approx \tfrac12$ does not by itself show that the
source has been effectively scrambled. For example, consider a sequence
$\{x_n\}$ which consists of consecutive blocks of 100 symbols each. All the
symbols in each block are alike; with probability $\tfrac12$ they are all ones,
and with probability $\tfrac12$ they are all zeros. Here $P(x_n = 1) = \tfrac12$ for all $n$,
yet the sequence has a very “nonrandom” nature. The implication is
that to determine the effectiveness of the scrambler, we must also
evaluate the statistical dependence between scrambler output bits.
By definition, the variables $b_m$ and $b_n$ are independent if

$$p(b_m, b_n) = p(b_m)\,p(b_n).$$

Accordingly, we define the function

$$d(b_m, b_n) = p(b_m, b_n) - p(b_m)p(b_n) = p(b_m \mid b_n)p(b_n) - p(b_m)p(b_n) \tag{11}$$

and show that the universal scrambler (Fig. 3) bounds the maximum
value of $|d(b_m, b_n)|$.

We do a worst-case analysis by assuming that $b_m$ and $b_n$ are related
by eq. (7). Further, we ignore the “s” bits appearing in eq. (7) because,
being independent of $R$, they can only weaken the dependence between
$b_m$ and $b_n$. Hence, we may compute the maximum value of $|d(b_m, b_n)|$
by assuming

$$b_m = b_n \oplus R. \tag{12}$$

From eqs. (8) and (9) we note $P(R = 1) = \tfrac12 - \delta$ and for convenience
we temporarily let $P(b_n = 1) = b$. Substituting these relations
and eq. (12) into eq. (11), we find that

$$|d(b_m, b_n)| = 2\delta\,[b(1 - b)]$$

for $b_m, b_n = 0, 1$; $m > n > 2^{M+1}$.

Hence, for $b = \tfrac12$ we obtain the general result

$$|d(b_m, b_n)|_{\max} = \delta/2.$$

Since $\delta$ may be forced arbitrarily small if $M$ is given by eq. (10), it
follows that any pair of output bits may be made nearly independent,
and we may whiten to any degree all the second-order statistics of the
source sequence.

We may also show that the joint (second-order) density $p(b_m, b_n)$
approaches that for the white sequence. The derivation of eqs. (6) to
(9) shows that both the density $p(b_i)$ and the conditional density
$p(b_i \mid b_j)$ must have values on the interval $[(\tfrac12 - \delta), (\tfrac12 + \delta)]$ for all
possible values of $b_i$ and $b_j$. Hence,

$$(\tfrac12 - \delta)^2 \le p(b_m, b_n) = p(b_m \mid b_n)p(b_n) \le (\tfrac12 + \delta)^2,$$

or

$$|p(b_m, b_n) - \tfrac14| \le \delta + \delta^2 \approx \delta,$$

where

$$b_m, b_n = 0, 1; \qquad m > n > 2^{M+1}.$$

For the white sequence $\{w_k\}$, we know $p(w_m, w_n) = \tfrac14$ for $w_m, w_n = 0, 1$.
Thus the joint density $p(b_m, b_n)$ may be whitened to any degree by
choice of $\delta$, $\epsilon$, and $M$.

Fig. 4—Scrambler stages required as a function of $\epsilon$ and $\delta$.

The discussion above constitutes a proof of the following theorem.

Universal Scrambler Theorem: A binary source with unknown output
statistics is connected to a binary symmetric memoryless channel and an
M-stage self-synchronizing scrambler as shown in Fig. 3. The channel has
error probability $\epsilon$ where $0 < \epsilon < \tfrac12$. The scrambler output is represented
by a sequence of random variables $\{b_n,\ n = 0, 1, \cdots\}$ and we define
$p(b_n)$ to be the first-order and $p(b_m, b_n)$ the second-order density functions
for $\{b_n\}$. Then for all $\delta > 0$; $m > n > 2^{M+1}$, and $b_m, b_n = 0, 1$,

$$|p(b_n) - \tfrac12| \le \delta$$

and

$$|p(b_m, b_n) - \tfrac14| \le \delta + \delta^2,$$

provided that

$$M \ge 1 + \log_2 \left[ \frac{\ln 2\delta}{\ln (1 - 2\epsilon)} \right]. \tag{15}$$


Figure 4 shows the relation between $M$, $\epsilon$, and $\delta$. As seen, $M$ is
primarily dependent upon $\epsilon$. This may be clarified by rewriting eq.
(15) for small values of $\epsilon$. We then obtain

$$M \approx \log_2 (1/\epsilon) + \log_2 [\ln (1/2\delta)]; \qquad \epsilon \ll \tfrac12.$$
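The small-$\epsilon$ form follows from eq. (15) in two steps. The expansion below is a worked derivation under the single assumption $\ln(1 - 2\epsilon) \approx -2\epsilon$, which holds when $\epsilon \ll \tfrac12$.

```latex
\begin{align*}
1 + \log_2\!\left[\frac{\ln 2\delta}{\ln(1-2\epsilon)}\right]
  &\approx 1 + \log_2\!\left[\frac{\ln 2\delta}{-2\epsilon}\right]
   = 1 + \log_2\!\left[\frac{\ln(1/2\delta)}{2\epsilon}\right] \\
  &= 1 + \log_2\!\left[\ln(1/2\delta)\right] - \log_2 2 - \log_2 \epsilon \\
  &= \log_2(1/\epsilon) + \log_2\!\left[\ln(1/2\delta)\right].
\end{align*}
```

The first term grows by one stage each time $\epsilon$ is halved, whereas the second depends on $\delta$ only through a double logarithm, which is why $M$ is primarily dependent upon $\epsilon$.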


The primary importance of this theorem is conceptual. To avoid 
inordinate difficulties, many analyses in the literature of digital trans- 
mission must assume a priori that the digital source is white. The 
theorem relaxes this restriction by showing that in concept the first- 
and second-order statistics of any source may be made asymptotically 
white. The practical application of this theorem is discussed in 
Section VI. 

V. AUTOCORRELATION OF THE SCRAMBLED SEQUENCE 


An important second-order statistic of the scrambled sequence is its
autocorrelation. We define the autocorrelation as the expectation

$$R(k) = E[b_n b_{n+k}],$$

and for convenience we let the value of $b_n$ be $+1$ or $-1$. Clearly,
$R(0) = 1$. For $k \neq 0$, we compute a bound on $|E[b_n b_{n+k}]|$. Following
the argument which led to eq. (12), we have

$$|E[b_n b_{n+k}]| \le |E[b_n (b_n \oplus R)]|; \qquad n,\ n + k > 2^{M+1};\ k \neq 0.$$

By definition,

$$E[b_n b_m] = \sum_i \sum_j i\,j\,P(b_m = i \mid b_n = j)\,P(b_n = j).$$

We let $b_m = b_n \oplus R$ and for convenience

$$P(b_n = 1) = b = 1 - P(b_n = -1).$$

Substituting, the dependency on $b$ vanishes, leaving us with

$$E[b_n (b_n \oplus R)] = 2\delta.$$

Hence,

$$R(k) = 1 \quad \text{for } k = 0, \qquad |R(k)| \le 2\delta \quad \text{for } k \neq 0. \tag{16}$$
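The statement that the dependence on b vanishes can be confirmed by enumerating the four joint outcomes. The sketch below is illustrative only; the function name and the sample values of b and the imbalance are assumptions. Bits are handled as 0 or 1 and mapped to -1 or +1 only when forming the product.

```python
def worst_case_correlation(b, delta):
    """E[x * y], where x = +/-1 encodes b_n with P(b_n = 1) = b and
    y encodes b_n XOR R with R independent and P(R = 1) = 1/2 - delta."""
    p_r_one = 0.5 - delta
    total = 0.0
    for bn, p_bn in ((1, b), (0, 1.0 - b)):
        for r, p_r in ((1, p_r_one), (0, 1.0 - p_r_one)):
            bm = bn ^ r
            total += (2 * bn - 1) * (2 * bm - 1) * p_bn * p_r
    return total


for b in (0.1, 0.5, 0.9):
    print(b, worst_case_correlation(b, 0.03))
```

Each line prints the same correlation, twice the assumed imbalance, regardless of b, in agreement with eq. (16).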


Note that, by forcing $\delta$ to a small value with proper choice of $M$ and
$\epsilon$, this autocorrelation approaches that for a “white” digital source
which has $R(k) = 0$, $k \neq 0$. This is shown graphically in Fig. 5 which
shows the autocorrelation of “white” and scrambled sources for unit
rectangular pulses.

Fig. 5—(a) Autocorrelation for white sequence $\{w_k\}$. (b) Autocorrelation bound
for scrambled sequence $\{b_k\}$.


VI. PRACTICAL CONSIDERATIONS 


In practice, the binary symmetric channel required by the theorem
might be implemented as shown in Fig. 6. The bit $r_k$ is a logic “one”
only when the level from the noise generator exceeds some threshold.
The threshold is set such that $P(r_k = 1) = \epsilon$. The noise source need
not be white, but values of $n(t)$ separated by the baud interval should
be independent. In principle, the combination of this simulated BSC
and an M-stage self-synchronizing scrambler will form a universal
scrambler capable of satisfactorily whitening the statistics of any
binary source. Such a scrambling structure could be used wherever
randomized bit statistics are essential and a small error rate can be
tolerated (or perhaps corrected by an error-correcting code).

Fig. 6—One possible implementation of the BSC.

There are, of course, good reasons to avoid actual implementation of
the BSC. First, it may be difficult to generate the $r_k$ sequence
accurately if $\epsilon$ is very small. Second, the deliberate generation of errors, if
not impractical, is at least unpalatable. Third, and most important,
many commonly encountered sources do not need it. Self-synchronizing
scramblers have been used successfully without any prior randomization
of the source (Ref. 4). In this section we consider the operation of the
scrambler without the BSC and show how a designer may use eq. (15)
to estimate the required scrambler length for a given application.

From eqs. (2) and (5) we deduce that the net effect of the binary
symmetric channel in Fig. 3 is

$$\epsilon \le p(a_k \mid a_0, a_1, \cdots, a_{k-1}) \le 1 - \epsilon; \qquad a_k = 0, 1, \tag{17}$$

for all $k$. In other words, because of the BSC there remains a small
uncertainty as to the value of any “a” bit, even though all the other
“a” bits might be known. As shown in the theorem, this and the
dissociation property are sufficient to guarantee effective scrambling.
Hence, if the designer knew to begin with that the source itself had the
characteristic

$$\epsilon \le p(s_k \mid s_0, s_1, \cdots, s_{k-1}) \le 1 - \epsilon; \qquad s_k = 0, 1, \tag{18}$$

then no BSC would be necessary, and eq. (15) could be applied directly.
For example, bit streams encoded from analog waveforms (such as
frequency-division multiplexed speech) often have such a property, and
a value for $\epsilon$ could be obtained from the coding rule and the amplitude
distribution of the analog signal.

For those cases in which a value for $\epsilon$ cannot be computed, let us
assume that the designer has at least some knowledge of the source
pulse density. He could then proceed by estimating a nominal value for
$\epsilon$ and then decreasing the value to allow some margin. For example, a
source which produces bit streams known to vary from 10 to 90 percent
“ones” over short periods (say, several hundred bits) would have a
nominal $\epsilon = 0.1$. It seems reasonable to allow at least one order of
magnitude “margin” in the estimate, resulting in $\epsilon = 0.01$. Then from
Fig. 4 we see that an eight-stage scrambler should be sufficient.
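The eight-stage figure can be related to the imbalance bound of eq. (9). The arithmetic below is a worked illustration only; the text does not state which value of $\delta$ was read from Fig. 4, so the numbers simply show what $M = 8$ and $M = 7$ deliver at $\epsilon = 0.01$.

```latex
\begin{align*}
M = 8:\quad \delta &= \tfrac12 (1 - 2\epsilon)^{2^{M-1}}
      = \tfrac12 (0.98)^{128} \approx \tfrac12\, e^{-2.586} \approx 0.038, \\
M = 7:\quad \delta &= \tfrac12 (0.98)^{64} \approx \tfrac12\, e^{-1.293} \approx 0.14 .
\end{align*}
```

Under these assumptions an eight-stage scrambler holds the first-order imbalance to a few percent, whereas seven stages leave it several times larger.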
Of course, estimating $\epsilon$ from the source pulse density does not
guarantee that eq. (18) really holds, but if the source sequence is not
strictly periodic (the case covered comprehensively by Savage), it is a
reasonable procedure. The point here is that even when we are
unwilling to commit deliberate errors to guarantee fixed source statistics,
we may still use eq. (15) to estimate how large a scrambler is required.
Heuristically speaking, eq. (15) is an expression for the “power” of
the scrambler, relating the “randomness” of the input and output to
the number of scrambler stages.


VII. CONCLUSIONS 


We have shown that at the cost of an arbitrarily small error rate it is
possible to “whiten” to any degree all the first- and second-order
statistics of any binary digital source. This relaxes the restriction
frequently found in the literature in which the digital source is assumed
a priori to produce only independent equiprobable symbols. The key
equation in our result [eq. (15)] is useful when designing a standard
self-synchronizing scrambler for a given application.

We leave unsolved the problem of whether universal scramblers exist 
for the M-ary source. 
APPENDIX
Proof of Lemma 1*

For convenience, we assume that in Fig. 2a the scrambler initially
contains all zeros. Since each scrambler output bit is ultimately a
mod-2 summation of selected input bits, we may write

$$b_n = \bigoplus_{k=0}^{n} h_k a_{n-k}, \tag{19}$$

where the binary sequence $h_k$ performs the selection. We note that if
$a_0 = 1$ and $a_i = 0$ for all $i > 0$, then $\{b_n\} = \{h_n\}$. But under these
conditions, as described in Section II, $\{b_n\}$ will be a maximal length
sequence. Hence, $\{h_n\}$ must itself be a maximal length sequence.

* Independently of the author, U. Henriksson has developed a proof of a similar
lemma.

Now we consider the two output bits $b_m$ and $b_n$. We wish to count
the number of “a” bits which entered the summation for $b_m$ but not
$b_n$. We have

$$\text{(a)}\quad b_n = \bigoplus_{k=0}^{n} h_k a_{n-k}, \qquad \text{(b)}\quad b_m = \bigoplus_{k=0}^{m} h_k a_{m-k}. \tag{20}$$

Since $m > n$,

$$b_m = \bigoplus_{k=0}^{m-n-1} h_k a_{m-k} \;\oplus \bigoplus_{k=m-n}^{m} h_k a_{m-k}
      = \bigoplus_{k=0}^{m-n-1} h_k a_{m-k} \;\oplus\; \bigoplus_{k=0}^{n} h_{m-n+k}\, a_{n-k}. \tag{21}$$

Examination of the subscript range shows that all the “a” bits
selected by the first summation in eq. (21) are unique to $b_m$. By
comparing the second summation with eq. (20a) we see that the additional
“a” bits which enter $b_m$ but not $b_n$ are those for which

$$h_{m-n+k} - h_{m-n+k} h_k = 1.$$

Hence,

$$J_{mn} = \sum_{k=0}^{m-n-1} h_k + \sum_{k=0}^{n} \left[ h_{m-n+k} - h_{m-n+k} h_k \right],$$

or

$$J_{mn} = \sum_{k=0}^{m-n-1} h_k + \sum_{k=0}^{n} h_{m-n+k} - \sum_{k=0}^{n} h_{m-n+k} h_k, \tag{22}$$

where addition is now in the usual sense.
We examine this expression in detail, recalling that the sequence 
thi} has period p = (2% — 1) and the given condition n > 2¥*1, 


Case (1): If m— n = Kp, K = 1, 2, ---, then 
hi = Ngenen = Nin waghe 


for all values of k. Hence the second and third summations cancel. But 
then the first summation contains K periods of a maximal length 
sequence. Since each period contains exactly 2““- ones,® the first 
summation totals at least 2¢%-». 

Case (2): If $m - n \neq Kp$, then it is easily shown’ that the sequence
formed by the term-by-term product $h_{m-n+k} \cdot h_k$ has period p and con-
tains $2^{(M-2)}$ ones per period. The sequence $\{h_{m-n+k}\}$ contains $2^{(M-1)}$
ones per period. Hence the net contribution of the second and third
summations is $2^{(M-2)}$ ones per period. Since $n > 2^{M+1} > 2p$, the
summations cover at least two periods. Thus their net total is at least
$2^{(M-1)}$.

Thus for either case,

$$J_0 = \min_{m,n > 2^{M+1}} [J_{mn}] = 2^{(M-1)}.$$
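The two counting facts used in Case (2) can be checked numerically on a small example. The sketch below is an illustration only: the 4-stage shift register, its starting state, and its feedback taps (stages 4 and 3) are assumptions chosen for the check and are not taken from the paper. It generates one period of a maximal length sequence, counts its ones, and counts the positions where the sequence and a cyclically shifted copy are both one.

```python
def m_sequence(taps, degree):
    """Return 2**degree - 1 output bits of a Fibonacci shift register.

    taps are 1-based stage numbers XORed to form the feedback bit.
    The output is one full period only if the taps give a maximal
    length sequence (assumed here for taps 4 and 3, degree 4).
    """
    state = [1] + [0] * (degree - 1)   # assumed nonzero starting state
    out = []
    for _ in range(2 ** degree - 1):
        out.append(state[-1])          # output is the last stage
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        state = [feedback] + state[:-1]
    return out


def overlap_ones(h, shift):
    """Count positions k where h[k] and h[k + shift] (cyclic) are both 1."""
    p = len(h)
    return sum(h[k] & h[(k + shift) % p] for k in range(p))


M = 4
h = m_sequence([4, 3], M)
ones_per_period = sum(h)                                   # expect 2**(M-1)
overlaps = [overlap_ones(h, s) for s in range(1, len(h))]  # expect 2**(M-2) each
print(ones_per_period, overlaps)
```

For this example every nonzero shift gives the same overlap count, which is the property the proof relies on when $m - n$ is not a multiple of the period.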
Dispersion and Equalization in Fiber
Optic Communication Systems 


By D. M. HENDERSON 
(Manuscript received July 2, 1973) 


The additional optical power required at the repeater input in a fiber
optic communication system due to intersymbol interference is experi-
mentally measured. In the experiment, the intersymbol interference which
results from differential mode delay in multimode fibers is minimized
with a five-tap transversal equalizer. Error rate measurements are per-
formed using five fibers ranging from 0.01 km to 1.25 km in length. In
this manner, the additional optical power required to achieve a given error
rate is found as a function of pulse width. The measured values compare
favorably with the power penalties predicted by Personick. The trade-off
between excess optical power and equalization penalty in dispersion- 
limited fiber systems is discussed. 


I. INTRODUCTION 


The temporal spreading of light pulses in an optical fiber can impose 
a limit on the highest data rate transmitted by a fiber optic communi- 
cation system. Such spreading arises from differential mode delay 
in multimode fibers and material dispersion in both single-mode and 
multimode fibers.1 The fiber materials and geometry together with 
the type of light source determine the magnitude of each effect. In 
this paper, we report the measurement of the additional optical power 
required to compensate for the loss in sensitivity resulting from the 
need to equalize detected light pulses that experience mode-delay 
spread. The experiment was carried out to determine the feasibility 
and practicality of equalization in dispersion-limited fiber systems. 

In the experiment, light from a Burrus²-type gallium arsenide
light-emitting diode (LED) digitally modulated at 48 Mb/s is coupled
into a liquid-core fiber.³ Intersymbol interference in the detected pulse
train is reduced with a transversal equalizer⁴ by forcing zero crossings
in the pulse response at all sampling times but one. Error rate measure-
ments are made as a function of optical power on five sections of
fiber ranging from 0.01 kilometer to 1.25 kilometers in length. From
these measurements, the power required to assure a given error rate
can be determined for the five pulse widths encountered. It is found
that the measured power penalty due to intersymbol interference
compares favorably with the values calculated by Personick.⁵ The
trade-off between excess optical power and increased repeater spacing
afforded by equalization is discussed in the concluding section of
this paper.


II. DESCRIPTION OF EXPERIMENT 


A block diagram of the experimental setup is shown in Fig. 1. For 
a source we use a Burrus-type diffused junction GaAs LED driven 
with a 48-Mb/s pseudorandom pulse stream. The LED output is 
first collimated, then attenuated as necessary with neutral-density 
filters, and finally focused onto the input of a liquid-core fiber. For 
the half-duty-cycle, return-to-zero, input light pulses used in the 
experiment, a maximum of −14.4 dBm average optical power can be
coupled into the fibers. 

In Table I, the measured loss and pulse width are shown for the five
lengths of fiber used. The pulse width for the shortest section is deter-
mined by the input pulse width and the bandwidth of the RCA
C30817 avalanche photodiode (APD) detector. The additional width
that is observed for the remaining four fibers is due to differential mode
delay. Note that the 0.75-km fiber has greater differential loss and
exhibits less pulse spreading than the fiber 0.50 km in length. Higher-
order modes are presumably more highly attenuated in the 0.75-km
fiber. The 1.25-km fiber is obtained by splicing the two together.

Fig. 1—Block diagram of experimental setup.

TABLE I—MEASURED LOSS AND PULSE WIDTHS FOR THE
FIVE FIBERS TESTED

Fiber length   Differential loss   Total loss   Rms pulse width
    (km)            (dB/km)           (dB)            (ns)
    0.01             27.8              ≈0              3.0
    0.50             27.8             13.9             7.5
    0.75             30.0             22.5             6.8
    1.12             25.5             28.6            10.9
    1.25             29.6             37.0            10.5

A high-impedance receiver is used, the first stage of which tends to
integrate the detected light pulses. Incorporated in the receiver is an
appropriate compensating network to assure that the receiver response
is flat over the bandwidth of interest. Such a design has been shown
to give improved signal-to-noise performance and reduced avalanche
gain over a conventional, nonintegrating receiver.⁶ At near-optimal
avalanche gain of 60, a $10^{-9}$ error rate is realized with −53.7 dBm
average optical power for the shortest section of fiber.

In order to equalize the detected optical pulses of various widths 
and shapes, a five-tap transversal equalizer is utilized. The tapped 
delay line with one time slot between taps is realized with RG188 
coaxial cable. Variable gain and polarity of the signal picked off at 
each tap are achieved with MC1733 wideband differential amplifier 
integrated circuits. An additional wideband amplifier serves as a 
summing amplifier to recombine the signals. The equalizer is manually 
adjusted for each fiber tested. 

Figure 2 shows the eye diagrams of the output of the "Nyquist"
filter both with and without equalization for the shortest section of 
fiber. Here the equalizer serves to modify slightly the combined band- 
width response of receiver and filter, giving an improved response. The 
eye diagrams for the 1.12-kilometer fiber are shown in Fig. 3. In this 
case, differential mode delay has resulted in significant intersymbol 
interference. Adjusting the equalizer for zero crossings results in the 
equalized signal shown. 

Fig. 2—Eye diagrams of input to regenerator with and without transversal equalizer 
for the 0.01-km fiber. 


The equalized pulse train is regenerated and then compared with 
the original pseudorandom sequence in an error detector. Error rate 
measurements are then made as a function of optical power for each 
fiber. Optical power readings are taken with a Coherent Radiation 
Model 212 power meter. 


III. RESULTS 


The shape of the detected optical pulse strongly influences the
amount of intersymbol interference and accordingly the additional
optical power required. To accurately determine these shapes, a
wideband, low-noise, 50-ohm amplifier is substituted for the high-
impedance receiver. Through signal averaging with boxcar integration,
the pulse shape can be recorded with signal-to-noise ratios in excess
of 100. The pulse shapes for the five fiber lengths are plotted in Fig. 4
versus the normalized parameter t/T, where T represents one time
slot (20.83 ns). From these data, the rms pulse widths σ of Table I were
computed from the relation

$$\sigma^2 = \int t^2 f(t)\,dt - \left[ \int t f(t)\,dt \right]^2.$$

Here f(t) is the measured pulse shape normalized to unit area. The inte-
gration was numerically performed. In the figure, it is seen that for the
shortest length, the pulse is confined to a single time slot. For the
intermediate lengths the pulses are effectively confined to two time
slots; for the longest lengths to three time slots.

Fig. 3—Eye diagrams of the input to regenerator with and without transversal
equalizer for the 1.12-km fiber.

Fig. 4—Detected pulse shapes for the five fiber lengths tested.
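The numerical integration behind the rms widths can be sketched as follows. This is a minimal illustration, not the procedure actually used in the experiment: the trapezoidal rule, the function names, and the synthetic Gaussian test pulse (3.0 ns rms, sampled every 0.05 ns) are all assumptions introduced here to exercise the formula.

```python
import math


def trapezoid(y, x):
    """Trapezoidal-rule integral of samples y over abscissae x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i])
               for i in range(len(x) - 1))


def rms_pulse_width(t, f):
    """rms width: sqrt( int t^2 f dt - (int t f dt)^2 ), f scaled to unit area."""
    area = trapezoid(f, t)
    g = [v / area for v in f]                       # normalize to unit area
    mean = trapezoid([ti * gi for ti, gi in zip(t, g)], t)
    second = trapezoid([ti * ti * gi for ti, gi in zip(t, g)], t)
    return math.sqrt(second - mean ** 2)


# Synthetic test pulse (assumed): Gaussian with a 3.0-ns rms width.
sigma_ns = 3.0
t = [-30.0 + 0.05 * i for i in range(1201)]
f = [math.exp(-ti ** 2 / (2.0 * sigma_ns ** 2)) for ti in t]

T_ns = 20.83                                        # one time slot at 48 Mb/s
width = rms_pulse_width(t, f)
print(width, width / T_ns)
```

Because the pulse is normalized inside `rms_pulse_width`, the raw recorded samples can be passed in without rescaling.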

A compilation of the error rate measurements is shown in Fig. 5. 
We have not attempted to fit the best curve to the data for each fiber. 
Instead we show superimposed on the data a curve of fixed shape 
which minimizes the deviation from the mean for all the fibers. The 
measured values fall within ±1/8 dB of the selected curve.
Fig. 5—Results of error rate measurements for the five fiber lengths tested
(horizontal axis: average optical power in dBm).


The dependence of required optical power on the detected pulse
width and shape has been calculated by Personick.⁵ In this calculation,
the degradation in signal-to-noise ratio is found when equalizing from
the detected pulse shape to a raised cosine shape. These results are
presented in Fig. 6 in terms of the additional optical power required
to maintain a fixed error rate. The dependence is shown for an expo-
nential pulse $(1/\sigma)\exp(-t/\sigma)$ and a Gaussian pulse
$(1/\sqrt{2\pi}\,\sigma)\exp(-t^2/2\sigma^2)$. When $(\sigma/T) < 0.25$ the power penalty for the two
shapes is about equal. This behavior follows from the fact that little
difference exists between the frequency spectra of the pulses over the
range of interest $(0 < \omega/2\pi < 1/T)$ in the narrow pulse limit. As
the pulse width increases, the frequency spectrum of the Gaussian
falls off much more rapidly than the exponential, resulting in a much
larger power penalty.
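The spectral argument can be made explicit with the standard Fourier transforms of the two unit-area pulse shapes. The exponential pulse is taken here as one-sided (zero for $t < 0$), which is an assumption about the intended shape:

```latex
\begin{aligned}
f_e(t) &= \frac{1}{\sigma}\, e^{-t/\sigma}, \quad t \ge 0
  &&\Rightarrow\quad |F_e(\omega)| = \frac{1}{\sqrt{1 + \omega^2 \sigma^2}}, \\
f_g(t) &= \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-t^2/2\sigma^2}
  &&\Rightarrow\quad |F_g(\omega)| = e^{-\omega^2 \sigma^2 / 2}.
\end{aligned}
```

Both magnitudes behave as $1 - \omega^2\sigma^2/2$ for small $\omega\sigma$, so they agree in the narrow pulse limit. For large $\omega\sigma$ the Gaussian spectrum decays exponentially in $\omega^2\sigma^2$, whereas the exponential-pulse spectrum decays only as $1/\omega\sigma$.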

Also shown are the measured points taken from Table I and Fig. 5.
As no measurement is performed in the limit $\sigma/T \to 0$, the point at
$\sigma/T = 0.15$ has been assigned the value of 0.25 dB to coincide with
the Gaussian curve at that point because the measured pulse shape
appears Gaussian rather than exponential. All other points have been
scaled upward by this amount. Scaling this point to the value for the
exponential pulse shape would merely shift all points up by an addi-
tional 0.12 dB and not affect the results significantly.

Fig. 6—The additional optical power required to maintain a fixed error rate versus
normalized pulse width.

The measured points do not fall on a continuous curve but rather 
define a range of values. A dashed line which bisects the measured 
points is shown. It is interesting to note that the 0.75-km and 1.25-km 
fibers which lie above the dashed line have a more symmetric pulse 
shape and less of a tail than the 0.50-km and 1.12-km fibers which fall 
below the line. Such dependence is expected because the presence of 
the tail leads to spectra which fall off less rapidly in the frequency 
domain and therefore will suffer less power penalty. Calculations 
confirm this dependence.*® 

These results point out the important role the detected pulse shape 
plays. The decidedly asymmetric pulses that result from differential 
mode delay in liquid-core fibers lead to power penalties that increase 
much more slowly with pulse width than the penalty predicted from 
Gaussian pulses. 


IV. DISCUSSION 


In order to point out the benefits and limitations of equalization
in dispersion-limited fiber optic communication systems, we treat a
specific case below. In the example, optical pulses from an LED source
are transmitted by a low-loss fused-silica multimode fiber.

The fiber loss is considered first. Define

$$R(\mathrm{dB}) = 10 \log \left[ P_{\mathrm{in}} / P_{\mathrm{req}}(T_0) \right], \tag{1}$$

where $P_{\mathrm{in}}$ is the optical power coupled into the fiber and $P_{\mathrm{req}}$ is the
power required in the absence of intersymbol interference to assure a
given error rate at the data rate $1/T_0$. The distance L (km) over which
one can communicate is governed by the fiber attenuation coefficient
α (dB/km) according to

$$R(T_0) - \alpha L \ge 0. \tag{2}$$

Consider the fused-silica multimode fiber announced by Corning Glass
Works⁷ for which α = 4 dB/km at 0.85 µm. If we take for $P_{\mathrm{req}}$ the
measured value of −53.7 dBm at 1 error in $10^{9}$ and assume that
−15 dBm can be launched into the fiber, then the inequality in eq. (2)
would hold for distances less than 9.6 km. Additional signal-to-noise
margin that is required would reduce this value in proportion to the
fiber loss.
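The loss-limited distance quoted above follows directly from eqs. (1) and (2). The short sketch below reproduces the arithmetic; the function and variable names are illustrative and are not from the paper.

```python
def loss_limited_distance_km(p_in_dbm, p_req_dbm, alpha_db_per_km):
    """Largest L satisfying R - alpha * L >= 0, with R = P_in - P_req in dB."""
    margin_db = p_in_dbm - p_req_dbm        # R of eq. (1), powers already in dBm
    return margin_db / alpha_db_per_km


# Values taken from the text: -15 dBm launched, -53.7 dBm required, 4 dB/km.
L_max = loss_limited_distance_km(-15.0, -53.7, 4.0)
print(L_max)
```

The margin is 38.7 dB, which at 4 dB/km gives a little under 9.7 km, consistent with the 9.6-km figure in the text.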

The required power is dependent on the data rate. For the high-
impedance receiver it has been shown⁶ that $P_{\mathrm{req}} \propto (T)^{-7/6}$. By defining
$r = 10 \log (T_0/T)^{7/6}$, eq. (2) can be written

$$R(T_0) - \alpha L \ge r(T_0/T). \tag{3}$$

From this result the maximum separation can be found as a function
of data rate. In Fig. 7 the curve marked 4 dB/km loss shows the
dependence. The inequality is satisfied for distances L below the line.
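The loss-limit curve of eq. (3) can be tabulated with a few lines of code. This is a sketch under stated assumptions: the exponent 7/6 is a reading of a partly illegible exponent in the source, the 38.7-dB margin and 4-dB/km loss are the example values from the text, and the function names are hypothetical.

```python
import math


def rate_penalty_db(rate_ratio, exponent=7.0 / 6.0):
    """r(T0/T) = 10 log10 (T0/T)**exponent; the 7/6 exponent is assumed."""
    return 10.0 * exponent * math.log10(rate_ratio)


def max_spacing_km(rate_ratio, margin_db=38.7, alpha_db_per_km=4.0):
    """Largest L satisfying eq. (3): R(T0) - alpha * L >= r(T0/T)."""
    return (margin_db - rate_penalty_db(rate_ratio)) / alpha_db_per_km


# rate_ratio = T0/T, i.e. data rate relative to the 48-Mb/s reference.
for ratio in (0.5, 1.0, 2.0, 4.0):
    print(ratio, max_spacing_km(ratio))
```

At the reference rate the penalty term vanishes and the result reduces to the loss limit of eq. (2); doubling the data rate costs about 3.5 dB of margin under the assumed exponent.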

Personick, et al.,⁸ have studied the pulse spreading in such a low-loss
fused-silica multimode fiber excited by an LED source. For a fiber
with a tailored index-of-refraction profile which gives an effective
numerical aperture of 0.12 and an effective diameter of 96 µm, they
find that material dispersion is the dominant pulse-spreading mecha-
nism due to the large spectral width (400 Å) of the LED. The measured
rms pulse width is found to be σ/L = σ′ = 1.75 ns/km.

If we do not equalize, but restrict the pulse width to avoid inter-
symbol interference, then an additional limit is imposed on L. As an
example, take the arbitrary restriction σ′·L/T < 0.35. In Fig. 7, this
inequality is satisfied to the left of the curve marked material dis-
persion. Therefore, the area in the lower left-hand side bounded by the
two limiting curves shows the available working distances versus
data rate. It should be noted that another possible way to reduce this
dispersion is to trade off optical power and LED bandwidth by
reducing the spectral width of the output. We deal with equalization
alone in this paper.

Fig. 7—Maximum repeater spacing versus data rate for the fiber system described
in text.
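The dispersion limit without equalization is a one-line calculation. The sketch below uses the values given in the text (1.75 ns/km and the arbitrary 0.35 bound); the function name and the conversion from Mb/s to a time slot in nanoseconds are illustrative choices.

```python
def dispersion_limited_distance_km(bit_rate_mbps,
                                   sigma_prime_ns_per_km=1.75,
                                   max_norm_width=0.35):
    """Largest L with sigma' * L / T <= max_norm_width, where T = 1 / bit rate."""
    time_slot_ns = 1000.0 / bit_rate_mbps
    return max_norm_width * time_slot_ns / sigma_prime_ns_per_km


L_disp = dispersion_limited_distance_km(48.0)
print(L_disp)
```

At 48 Mb/s this gives about 4.2 km, well inside the loss limit, which is why excess optical power is available at that rate.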

At higher data rates, dispersion limits the repeater spacing before
the fiber loss limit occurs. Below we consider the effect of using this
excess power for equalization to increase the spacing. The power
penalty has been presented as a function of pulse width in Fig. 6.
Here σ/T becomes σ′L/T. Let p(L, T) expressed in dB represent the
power penalty. Equation (3) then becomes

$$R(T_0) - \alpha L \ge r(T_0/T) + p(L, T). \tag{4}$$

The solutions to this equation are shown in Fig. 7 for both the Gaussian
and exponential pulses. At the data rate used in this experiment
(48 Mb/s) the repeater separation is increased from 4.2 km to
6.5 km for the Gaussian-shaped pulses. At that distance the rms
pulse width is σ/T = 0.55. For exponential pulses the spacing could
be as large as 8.4 km with accompanying pulse width σ/T = 0.70.
The exact distance will depend strongly on the detailed shape of the
detected pulses as noted previously. In any case, increases of at least
50 percent would be possible. At higher data rates the potential
increases are greater still.
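The normalized pulse widths and the percentage gains quoted above can be cross-checked from the spacings given in the text. The helper names below are illustrative; the spacings themselves (4.2, 6.5, and 8.4 km) are read from the text, not computed from eq. (4), since the penalty curves of Fig. 6 are not available in numerical form here.

```python
SIGMA_PRIME_NS_PER_KM = 1.75
TIME_SLOT_NS = 1000.0 / 48.0            # one time slot at 48 Mb/s


def normalized_width(length_km):
    """sigma/T = sigma' * L / T for a fiber of the given length."""
    return SIGMA_PRIME_NS_PER_KM * length_km / TIME_SLOT_NS


def percent_increase(new_km, old_km):
    """Relative increase in repeater spacing, in percent."""
    return 100.0 * (new_km - old_km) / old_km


unequalized_km = 4.2
for label, spacing_km in (("Gaussian", 6.5), ("exponential", 8.4)):
    print(label, normalized_width(spacing_km),
          percent_increase(spacing_km, unequalized_km))
```

The Gaussian case gives roughly a 55 percent increase and the exponential case a doubling, in line with the statement that increases of at least 50 percent are possible.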


V. SUMMARY 


We have experimentally measured the additional optical power that 
is required to maintain a fixed error rate for digitally modulated light 
pulses that encounter differential mode delay in multimode fibers. 
The resulting power penalty versus pulse width compares favorably 
with the values predicted by Personick. We find that the asymmetric 
shape of the mode-delay-spread pulses results in penalties which 
increase much more slowly with pulse width than does the penalty 
predicted for Gaussian pulses. Each potential dispersion-limited fiber 
system must be examined separately to determine the potential 
increase in repeater spacing that equalization offers as the pulse 
spreading mechanisms vary significantly among different sources and 
fibers. The example presented shows graphically how one can utilize 
excess optical power in equalizing delay distortion and thereby 
maximize the repeater spacing for a given data rate. 
Phase Dispersion Characteristics During
Fade in a Microwave Line-of-Sight
Radio Channel


By M. SUBRAMANIAN, K. C. O’BRIEN, and P. J. PUGLIS 
(Manuscript received February 5, 1973) 


Measurements of phase and amplitude dispersion over a 20-MHz band
have been made on a 42-km, 6-GHz, line-of-sight microwave link. A novel
technique is introduced for measuring the phase dispersion induced by
the propagation path. Specifically, the amplitudes and relative phases of
four tones separated equally by 6.6 MHz have been continuously monitored
over a period of four months. The data show that there is usually measur-
able (0.02 degree/(MHz)²) phase distortion over the 20-MHz band
during those fades whose depth exceeds about 20 dB. These dispersive
fades, which usually last a few seconds, typically occur along with shallow
and essentially nondispersive fades that have durations of several minutes.
However, only the dispersive fades exhibit a phase nonlinearity. Analysis
of 16 events measured in the autumn of 1970 yields the following results.


(i) The distribution curve describing the fraction of time that phase
nonlinearity (quadratic) exceeds a given value follows a log-
normal distribution.

(ii) The quadratic phase nonlinear coefficient exceeds an average value
of 0.1 degree/(MHz)² for fades with depth larger than 34 dB
from the nominal level. This corresponds to a time delay distortion
of 0.55 nanosecond over a 1-MHz band.

(iii) The correlation between log-amplitude and phase nonlinear
coefficients yields a correlation coefficient whose magnitude is
fade-depth dependent and whose sign varies from event to event.


The experimental technique of measuring phase dispersion reported 
here may be of interest not only for propagation studies but also in other 
systems such as measurement of characteristics of electrical networks. 
The statistical results obtained on the phase characteristics may prove of 
interest in formulating an analytical model. Further, they may be of 
significance in the design of existing and future microwave systems. 
I. INTRODUCTION


Fading in microwave communication channels has been the subject 
of investigation by many workers for a considerable length of time. 
However, the emphasis on these studies has been more toward the 
behavior of the amplitude characteristics of the signal rather than of 
its phase characteristics. The purpose of this study was to investigate 
the phase characteristics of a microwave signal transmitted over a 
typical tropospheric, line-of-sight link.* Specifically, the following 
topics were addressed in the experiment. 


(i) Measurement of phase variations over a microwave radio
channel, as a function of both frequency and time; obtaining
from the experimental results statistics on the phase non-
linearity in a microwave radio channel.

(ii) Correlation, if any, between the amplitude and phase distortions.


Measurements were made in late 1970 on a TH-8 radio channel
operating between Atlanta and Palmetto, Georgia. The experiment 
was conducted as an adjunct to an ongoing study by other members 
of Bell Laboratories. The transmitter located in Atlanta and the front 
end of the receiver situated in Palmetto were common to both experi- 
ments. However, a different set of apparatus was employed for the 
measurement of amplitude and phase in the present study. G. M.
Babler¹ has reported on the experimental layout of the microwave
link.

This paper addresses three major areas: experimental technique 
and arrangement, measured data, and statistical analysis. The experi- 
mental technique is a novel one in that it measures directly the phase 
difference between pairs of transmitted tones separated in frequency. 
Only the phase difference induced by the propagation path is measured, 
not the transmitter and receiver beat oscillator fluctuations. More 
specifically, at a carrier frequency of 6 GHz, four tones separated by 
6.6 MHz and with a definite phase relation are transmitted over a 
42-km line-of-sight path. At the receiving end, the signals are brought 
to a 70-MHz IF and filtered out. The amplitude of each tone is con- 
tinuously monitored. The phase difference between each adjacent 
pair is measured. 

Data were recorded during roughly 100 events in which the fade
depth exceeded 10 dB. These fades can generally be divided into two
categories. The majority of them are relatively shallow (<20 dB),
long-lasting (minutes) events which show little amplitude and non-
linear phase distortion, although linear phase dispersion corresponding
to path length variations may be observed. The second class consists of
deep and brief (seconds) fades showing substantial amplitude dispersion
and nonlinear phase dispersion. This second class could present
difficulties to communication systems.


* As a result of the findings reported here, a more comprehensive, higher resolu-
tion measurement program has been undertaken to characterize more completely
the dispersive microwave channel. It is expected that a summary of these results will
be published here in the near future.

Twenty-six dispersive fading events whose fade depth exceeded 20
dB were observed during the autumn of 1970, a smaller number than
usual. Because of equipment malfunction, only 16 of these could be
analyzed. This analysis yielded the following results.


(i) The distribution curve describing the fraction of time that
the phase nonlinearity (quadratic) exceeds a given value follows
an almost log-normal distribution.

(ii) The quadratic phase nonlinear coefficient exceeds an average
value of 0.1 degree/(MHz)² for fades that are deeper than 34
dB from the nominal level. This corresponds to a time delay
distortion of 0.55 nanosecond over a 1-MHz band.

(iii) No simple relationship seems to exist for the correlation be-
tween the quadratic log-amplitude and the quadratic phase
nonlinear coefficients.
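One way to reproduce the 0.55-nanosecond figure in result (ii) is sketched below. It assumes, as an interpretation not spelled out in the text, that the phase is modeled as $\Phi(f) = c f^2$ (degrees, with f in MHz) and that the delay distortion is the change in envelope delay, $(1/360)\, d\Phi/df$, across the stated bandwidth. The function name is hypothetical.

```python
def delay_distortion_ns(quad_coeff_deg_per_mhz2, bandwidth_mhz):
    """Change in envelope delay across bandwidth_mhz for Phi(f) = c * f**2.

    dPhi/df = 2 c f [deg/MHz]; dividing by 360 deg/cycle gives delay in
    microseconds when f is in MHz. The model itself is an assumption.
    """
    slope_change_deg_per_mhz = 2.0 * quad_coeff_deg_per_mhz2 * bandwidth_mhz
    delay_us = slope_change_deg_per_mhz / 360.0
    return delay_us * 1000.0


distortion = delay_distortion_ns(0.1, 1.0)
print(distortion)
```

Under this model a coefficient of 0.1 degree/(MHz)² gives about 0.56 ns over 1 MHz, matching the quoted value to the precision stated.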


II. DESCRIPTION OF PHASE DISPERSION MEASUREMENT 


This experiment measures the effect of the transmission path on the
relative phase of signals at different frequencies. Specifically, a "picket
fence" of tones in the 6-GHz range, separated from each other by
0.55 MHz, is transmitted over an approximately 42-km path. These
tones are generated at the transmitter by means of a "picket-fence"
generator developed by G. A. Zimmerman of Bell Laboratories. The
experimental apparatus measures the phase difference between pairs
of these tones separated by 6.6 MHz in such a way that only the phase
difference induced by the transmission path is measured, and not the
transmitter and receiver beat oscillator fluctuations. Although a closer
spacing between the tones would have been more desirable, practical
considerations limited the selection to four tones distributed uniformly
over the 20-MHz radio channel. Further, at the time of initiation of
this experiment, the available statistics² on the amplitude dispersion
during deep microwave fading did not seem to indicate any significant
fine structure over a 20-MHz band, and consequently no fine structure
of phase nonlinearity of significant magnitude was expected as a
result of fading by multiray phenomena.

To understand the measurement, it is important to trace a signal all
the way through the system. A 70.4-MHz signal inserted into the
picket-fence generator is divided by 128 and fed into a pulse generator.
The resulting pulses are then mixed with the 70.4-MHz signal, pro-
ducing a picket fence of tones separated by 0.55 MHz and centered at
70.4 MHz. In addition to having equal amplitude, the tones have the
same initial phase. The time-dependent frequency error of the 70.4-
MHz signal caused by oscillator fluctuations may conveniently be
expressed as $128\,\Delta\omega_o(t)$. The nth tone can then be written, with
$\omega_o = 70.4/128$ MHz, as

$$A_o \cos \left[ n(\omega_o + \Delta\omega_o)t \right], \tag{1}$$

where $A_o$ is the amplitude of each picket. Up conversion at the trans-
mitter by a beat oscillator with angular frequency $\omega_T$ yields

$$A_T \cos \left\{ \left[ \omega_T + n(\omega_o + \Delta\omega_o) \right] t + \Phi_T \right\}, \tag{2}$$

where $A_T$ is the amplitude of the transmitted up-converted signal and
$\Phi_T$ is the phase shift introduced by the beat oscillator. After trans-
mission through the atmosphere, the signal becomes

$$A_T A_n \cos \left\{ \left[ \omega_T + n(\omega_o + \Delta\omega_o) \right] t + \Phi_T + \Phi_n^a \right\}, \tag{3}$$

where $A_n$ and $\Phi_n^a$ are the amplitude and phase modulation introduced
by the atmosphere on the nth tone and $A_T$ is the amplitude of the
received signal without any modulation by the atmosphere. Down
conversion at the receiver yields

$$A_R A_T A_n \cos \left\{ \left[ \omega_T - \omega_R + n(\omega_o + \Delta\omega_o) \right] t + \Phi_T - \Phi_R + \Phi_n^a \right\}, \tag{4}$$

where $A_R A_T A_n$ is now the amplitude of the IF signal, $\omega_R$ is the angular
frequency of the receiver beat oscillator, and $\Phi_R$ is the phase shift
introduced by the receiver beat oscillator. Consider that this IF signal
is mixed with a signal from a standard oscillator (phase changes are
included in $\Delta\omega_s$) described by

$$A_s \cos (\omega_s + \Delta\omega_s)t. \tag{5}$$

We then have, for the upper and lower sideband signals (ignoring a
constant),

$$S_n = A_s A_R A_T A_n \cos \left\{ \left[ \omega_T - \omega_R + n(\omega_o + \Delta\omega_o) \pm (\omega_s + \Delta\omega_s) \right] t + \Phi_T - \Phi_R + \Phi_n^a \right\}. \tag{6}$$
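In order to compare two tones (m and n) with phases $\Phi_m^a$ and $\Phi_n^a$,
$n > m$, choose

$$\omega_s = \left( \frac{n - m}{2} \right) \omega_o. \tag{7}$$

By appropriate filtering, $S_n$ and $S_m$ are isolated. After being mixed
with $\omega_s$, the lower sideband of $S_n$, denoted by $S_{nl}$, and the upper side-
band of $S_m$, denoted by $S_{mu}$, are separated, yielding

$$S_{nl} = A_s A_R A_T A_n \cos \left\{ \left[ \omega_T - \omega_R + \left( \frac{n + m}{2} \right) \omega_o + n\Delta\omega_o - \Delta\omega_s \right] t + \Phi_T - \Phi_R + \Phi_n^a \right\} \tag{8}$$

$$S_{mu} = A_s A_R A_T A_m \cos \left\{ \left[ \omega_T - \omega_R + \left( \frac{n + m}{2} \right) \omega_o + m\Delta\omega_o + \Delta\omega_s \right] t + \Phi_T - \Phi_R + \Phi_m^a \right\}. \tag{9}$$

Measurement of the difference in phase of $S_{nl}$ and $S_{mu}$ yields

$$\left[ (n - m)\Delta\omega_o - 2\Delta\omega_s \right] t + \Phi_n^a - \Phi_m^a = \Delta\Phi. \tag{10}$$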
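The cancellation that leads to eq. (10) can be verified with exact arithmetic. The sketch below evaluates the cosine arguments of the lower sideband of tone n and the upper sideband of tone m and compares their difference with the right-hand form of eq. (10). All numerical values are arbitrary placeholders chosen for the check, units are arbitrary, and the function name is hypothetical.

```python
from fractions import Fraction as F


def sideband_argument(t, n, sign, w_T, w_R, w_o, dw_o, w_s, dw_s,
                      phi_T, phi_R, phi_atm):
    """Cosine argument of eq. (6); sign = +1 upper sideband, -1 lower."""
    freq = w_T - w_R + n * (w_o + dw_o) + sign * (w_s + dw_s)
    return freq * t + phi_T - phi_R + phi_atm


n, m = 13, 1
w_o = F(11, 20)                        # picket spacing (placeholder units)
w_s = F(n - m, 2) * w_o                # eq. (7)
common = dict(w_T=F(6000), w_R=F(5930), w_o=w_o, dw_o=F(1, 1000),
              w_s=w_s, dw_s=F(1, 2000), phi_T=F(1, 3), phi_R=F(1, 5))
phi_n, phi_m = F(2, 9), F(1, 11)       # atmospheric phases (placeholders)
t = F(3, 7)

lower_n = sideband_argument(t, n, -1, phi_atm=phi_n, **common)
upper_m = sideband_argument(t, m, +1, phi_atm=phi_m, **common)
measured = lower_n - upper_m
expected = ((n - m) * common["dw_o"] - 2 * common["dw_s"]) * t + phi_n - phi_m
print(measured == expected)
```

The transmitter and receiver oscillator terms ($\omega_T$, $\omega_R$, $\Phi_T$, $\Phi_R$) drop out of the difference whatever values are assigned to them, which is the point of the measurement scheme.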


If the first term is small, we have a direct measure of the quantity
of interest. In this case, $\omega_o$ and $\omega_s$ are derived from similar highly
stable sources (Hewlett-Packard 105B quartz oscillators and General
Radio 1165 and 11638 frequency synthesizers). The phase drifts,
$12\Delta\omega_o t$ and $2\Delta\omega_s t$ [in eq. (10) corresponding to the selected value for
$(n - m) = 12$ in the experiment], are approximately 1 degree/day.
Note that, if the frequency deviation is the same for both oscillators,
even this small phase drift cancels out. Further, the transmitter and
receiver oscillators are manually synchronized every day, except on
Sundays. However, this small drift produces a running phase difference
(which is linear with time) between tones and can be subtracted out
during the data analysis by measuring the linear slope before and after
a fading event, thus causing no error to the data. The short-term rms
frequency deviations (<1 s) are on the order of $1 \times 10^{-8}$/ms or
$1 \times 10^{-11}$/s. At the frequency of 6.6 MHz, this corresponds to an rms
phase noise of 0.024 degree. When the picket-fence generator was run
directly into the phase and amplitude measurement system, the
observed rms phase noise out of the network analyzers was 0.03
degree. This is considerably smaller than the accuracy of the measuring
equipment, which is ±1 degree.
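The 0.024-degree figure can be reproduced with simple arithmetic. The sketch below assumes that the stability figures are fractional frequency deviations of $10^{-8}$ over 1 ms and $10^{-11}$ over 1 s (a reading of partly illegible exponents in the source, chosen because it is consistent with the quoted phase noise), and that the phase error is the resulting frequency error at the 6.6-MHz tone separation accumulated over the averaging interval. The function name is hypothetical.

```python
def rms_phase_noise_deg(fractional_stability, averaging_time_s, tone_separation_hz):
    """Phase error in degrees from a fractional frequency deviation.

    Assumed model: frequency error = stability * tone separation, and
    the phase accumulated over the averaging time is 360 deg per cycle.
    """
    freq_error_hz = fractional_stability * tone_separation_hz
    cycles = freq_error_hz * averaging_time_s
    return 360.0 * cycles


noise_1s = rms_phase_noise_deg(1e-11, 1.0, 6.6e6)
noise_1ms = rms_phase_noise_deg(1e-8, 1e-3, 6.6e6)
print(noise_1s, noise_1ms)
```

Both stability figures lead to the same phase error under this model, which supports reading them as two statements of one specification.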
Figure 1 is a block diagram of a simplified system. Of the entire
assembly of pickets, those numbered "1" and "13" are separated
out by the narrow bandpass filters. Both $\omega_1$ and $\omega_{13}$ are mixed with an
$\omega_s = (\omega_{13} - \omega_1)/2$ derived from the same source. The second pair of
filters labeled $[(\omega_1 + \omega_{13})/2]$ separates the appropriate sidebands.

Figure 2 is a block diagram of the actual measuring system. The 
entire picket fence is received and divided into four channels. In each 
channel one tone is filtered through. The four frequencies used are 
53.35, 59.95, 66.55, and 73.15 MHz, spanning the desired range of 
approximately 20 MHz, the width of the radio channel. The nominal
level of the individual tones at the input to the system is approxi-
mately −46 dBm.
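The relationship between the four monitored tones and the 0.55-MHz picket grid can be checked directly. The sketch below locates each tone as an offset from the 70.4-MHz center in units of the picket spacing; the offset numbering is introduced here for illustration and is not the picket numbering used in the figures.

```python
CENTER_MHZ = 70.4
SPACING_MHZ = CENTER_MHZ / 128          # 0.55-MHz picket spacing


def picket_offset(freq_mhz):
    """Offset from the 70.4-MHz center in units of the picket spacing."""
    return (freq_mhz - CENTER_MHZ) / SPACING_MHZ


tones_mhz = [53.35, 59.95, 66.55, 73.15]
offsets = [round(picket_offset(f)) for f in tones_mhz]
on_grid = all(abs(picket_offset(f) - round(picket_offset(f))) < 1e-6
              for f in tones_mhz)
steps = [b - a for a, b in zip(offsets, offsets[1:])]
span_mhz = tones_mhz[-1] - tones_mhz[0]
print(offsets, on_grid, steps, span_mhz)
```

Each adjacent pair of tones is 12 pickets apart, which is the value of $(n - m)$ used in the experiment, and the four tones span 19.8 MHz.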

Each filtered tone is amplified and mixed with a signal of frequency 
ws from the standard oscillator. The output is then split into its two 
sidebands, thus producing three pairs of signals, each pair having the 
same frequency difference. 

The network analyzer (Hewlett-Packard Model 676A-H05, which
is a modified version of 676A to meet our requirements) measures the
relative phase (10 mV/degree), the amplitudes (50 mV/dB), and the
ratio of the amplitudes of the two input signals. Amplitudes can be
measured over a maximum of 80-dB range with up to 0.01-dB resolu-
tion and ±1.5-dB accuracy. Phase difference can be measured with up
to 0.02-degree resolution and ±1-degree accuracy if the tone levels are
not too widely different. The measured results on the system indicate
that for tones whose amplitude difference is less than 30 dB, which is
well within the requirements of our experiment, these specifi-
cations were satisfactorily met. The network analyzer performs its
amplitude and phase measuring functions at an IF frequency of 100
kHz. The accuracies quoted above can be achieved only if this 100 kHz
is stable to less than 100 Hz. Because of drifts in the beat frequency
oscillators at transmitter and receiver, the frequencies of the signals
at the input of the network analyzers drift (together) by a few hundred
hertz. In order to maintain the network analyzer's IF frequency
constant, it is necessary to track these drifts. This is accomplished by
sampling in each network analyzer one of the input signals, amplifying
it to a constant level, mixing it with a 100-kHz signal, filtering out
one (the upper) sideband, and employing this signal as a local oscilla-
tor in the network analyzer. The IF strip in the network analyzer then
operates at a constant 100 kHz. This arrangement is shown in Fig. 3.
The dc voltage outputs of the network analyzers are recorded on a
7-channel FM tape recorder with a dc-to-625-Hz bandwidth. The four
amplitudes and three phase differences are recorded. Note that if one
knows $\Phi_2 - \Phi_1$ and $\Phi_3 - \Phi_2$ one can compute $\Phi_3 - \Phi_1$, etc.

Fig. 1—Basic schematic of two-channel phase and amplitude comparator.


III. DISCUSSION OF EXPERIMENTAL TECHNIQUE 


The experimental technique described here has several advantages
and limitations compared with others used in the past.³⁻⁵

The distinguishing characteristic of this technique is that the actual
phase difference between two signals is measured. Thus, if we consider
$\Phi = \Phi(\omega)$, we measure $\Delta\Phi/\Delta\omega$ rather than simply $d^2\Phi/d\omega^2$, which is
the delay distortion. Further, the tones are generated and measured
at IF in such a way that the phase and frequency variations of the
up and down converters do not affect the measurements. Hence, the
same apparatus could be used at any desired carrier frequency. A
knowledge of the envelope delay, $d\Phi/d\omega$, could be of use in the analysis
of multipath fading of a line-of-sight microwave link.

The basic problem facing anyone wishing to measure phase differ- 
ences of the same signal reaching two widely separated receivers, as 
in very-long-baseline interferometry (VLBI) or, as in our case, of two 
signals generated at one place and received at another, is one of a time 
reference. One technique is to transmit a timing signal either over the 
air or through a cable from one place to another. This approach 
suffers from unknown variations in the signal path because of atmo- 
spheric changes in the broadcast case or temperature variations in the 
cable case. Following the lead of the workers in VLBI, similar standard 
oscillators have been set up at the transmitter and receiver. These 
oscillators have sufficient short-term (<1 s) and long-term (<1 
degree/day) stability to enable measurements to be made within the 
desired accuracy (0.03 degree rms for 6.6-MHz tone separation). 


IV. CHARACTERISTICS OF FADES 


This section describes the temporal behavior of the signal during 
fading. The data were recorded on analog FM tapes continuously from 
September 11 to December 31, 1970. The system was shut down on 
two occasions for several days for servicing. The system was also 
turned down a few times for several hours at a time for making tests. 
The tape was manually changed as it approached the end. Each tape 
lasted an average of five days. The tape was turned on automatically 
whenever any one tone exceeded 10-dB fade and ran till all the four 
tones recovered from deeper than 10-dB fade. The nature of the fades 
can be classified into the following broad categories.
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(i) All four tones in the 20-MHz band fade simultaneously,
with the level of fade being approximately the same for all of
them. In other words, there is no significant frequency selectivity
present. Such fades last several minutes. These fades are
usually relatively shallow (less than or about 20 dB).

(ii) The four tones fade selectively; that is, the fade levels of the
four tones are significantly different. Such selective fades are 
usually deep (greater than about 20 dB) and last no more than 
a few seconds and often much less than one second. The disper- 
sive fades are usually superimposed on shallow, nondispersive 
ones. The deepest fade level on the four tones may occur either 
simultaneously or separated in time by a small fraction of a 
second. In the former case the minimum amplitude in the band 
occurs at a single frequency in the band, whereas in the latter 
case the minimum traverses across the band in time. 


There is no significant phase nonlinearity during the nondispersive
fades mentioned in case (i) above. A temporal presentation of the
amplitudes of the four tones and phase differences between them
during a typical nondispersive fade is given in Fig. 4. For the sake of
clarity, the traces are displaced from one another. The fade level of
$A_1$ is with respect to its nominal unfaded value. The nominal unfaded
levels of $A_2$, $A_3$, and $A_4$ are displaced by approximately 20, 40, and
60 dB, respectively. Similarly, the three phase difference traces are
displaced from each other by about 50 degrees. The fade duration is 
about 1 minute, and the maximum fade depth is about 15 dB. There 
is no noticeable variation in the magnitudes of the difference between 
adjacent tones. Although the traces of the phase differences in Fig. 4 
are nearly horizontal, they have, in general, a linear slope with time 
which is the same for all three. As mentioned in Section II, this is 
caused by the difference in the quartz oscillator frequencies between 
the one driving the picket-fence generator at the transmitter and that 
of the other in the phase/amplitude measuring system at the receiver. 
Any path length change is reflected on the three traces as a change 
from the linear slope and will be the same in all the three traces. 
However, when there is a phase nonlinearity present within the 
20-MHz band, the three traces will have different slopes. 

Figure 5 represents a dispersive fade of the type discussed in case
(ii) above. The total duration of the fade is in excess of 5 minutes,
and the maximum fade depth is 39 dB (on $A_2$). Once again, the time
axis has been referenced with respect to that at which the deepest 
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Fig. 4—Nondispersive fading. 


fade occurs. It can be observed in this particular event that there are
two deep fades, separated by about 2 minutes, superimposed on the
prolonged shallow fades. It is further observed from the phase data
that significant phase nonlinearity is present only during the deep
fade periods, which last only on the order of seconds. The magnitude of
the phase nonlinearity depends on the depth of the fade and its
amplitude dispersion. Thus, while the first fade in Fig. 5 exhibits
significant phase nonlinearity, the second does not.

A time-expanded representation of the first fade in Fig. 5 is given 
in Fig. 6. Large selective fading is exhibited during this event for an 
extended period of time. During the few seconds around the deepest 
fade period, the fade dispersion curve changes slope and the high- 
frequency end of the band (A4) fades deeper than the rest. The phase 
nonlinearity is exhibited predominantly during this period of deepest 
fade. Figure 6 has been redrawn in Fig. 7 by interchanging the role of 
time and frequency parameters. The dispersion with respect to fre- 
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Fig. 5—Dispersive fading. 


quency of the fade depth and the phase difference are plotted as discrete 
functions of time. This presentation visually portrays the nonlinearities 
as greatest around zero time. We can also observe that the phase 
difference curve changes from a convex to a concave shape between 0 
and 0.2 second. Thus, the phase nonlinearity could assume zero value 
around the time of deepest fade. 

The relationship between the amplitude distortion and phase 
nonlinearity was explored by fitting a second-order polynomial curve 
over the log-amplitude and phase dispersion curves and then corre- 
lating the coefficients of the quadratic terms. 

The following second-order polynomial expansion was assumed for
log-amplitude dispersion $X(f)$ (in dB) and phase dispersion $\phi(f)$ (in
degrees).

$$X(f) = a_x(f_3) + b_x (f - f_3) + c_x (f - f_3)^2 \tag{11}$$

$$\phi(f) = a_\phi(f_3) + b_\phi (f - f_3) + c_\phi (f - f_3)^2 \tag{12}$$

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Fig. 6—Dispersive fading. 


The temporal behavior of the quadratic log-amplitude and phase
nonlinear coefficients, designated by $c_x$ and $c_\phi$, respectively, is given
in Fig. 8. There appears to be some degree of correlation between
these coefficients except around zero time reference. As explained in
the description of Fig. 7, the reason for the deviation around zero time
is the change in the convexity of the phase dispersion curve.
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Fig. 7—Amplitude and phase dispersion during dispersive fade. 


V. DATA ANALYSIS 


The amplitude and phase dispersion data have been fitted with 
second-order polynomial curves, and the nonlinear coefficients of the 
two have been studied. Specifically, results on the following have been 
obtained. 


(i) The distribution curve describing the fraction of time that
the phase nonlinearity exceeds a given value.

(ii) The average of the magnitude of the phase nonlinearity as a
function of the depth of the fade.

(iii) The nature of the correlation between log-amplitude and phase
nonlinear coefficients.

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Fig. 8—Temporal behavior of log-amplitude and phase nonlinearities ($c_x$, $c_\phi$). 


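Letting $f_3 = 0$ and $f - f_3 = \Delta f$ in eqs. (11) and (12), we obtain

$$X(\Delta f) = a_x(0) + b_x (\Delta f) + c_x (\Delta f)^2 \tag{13}$$

$$\phi(\Delta f) = a_\phi(0) + b_\phi (\Delta f) + c_\phi (\Delta f)^2. \tag{14}$$

The data were taken on four tones, at $\Delta f = -13.2$ MHz, $-6.6$ MHz,
0, and $+6.6$ MHz. For the sake of convenience in the analysis, the
frequency separation was normalized with respect to $\Delta f = 13.2$ MHz.
Thus, the four data points corresponding to eq. (14) were designated
by the following set of simultaneous equations.

$$\begin{aligned}
\phi(-1) &= \tilde{a}_\phi(0) + \tilde{b}_\phi(-1) + \tilde{c}_\phi(-1)^2 \\
\phi(-\tfrac{1}{2}) &= \tilde{a}_\phi(0) + \tilde{b}_\phi(-\tfrac{1}{2}) + \tilde{c}_\phi(-\tfrac{1}{2})^2 \\
\phi(0) &= \tilde{a}_\phi(0) \\
\phi(+\tfrac{1}{2}) &= \tilde{a}_\phi(0) + \tilde{b}_\phi(\tfrac{1}{2}) + \tilde{c}_\phi(+\tfrac{1}{2})^2.
\end{aligned} \tag{15}$$

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Comparing the coefficients of eqs. (14) and (15), we obtain

$$\begin{aligned}
a_\phi &= \tilde{a}_\phi \\
b_\phi &= \tilde{b}_\phi \left( \frac{1}{2 \times 6.6 \times 10^6} \right) \\
c_\phi &= \tilde{c}_\phi \left( \frac{1}{2 \times 6.6 \times 10^6} \right)^2.
\end{aligned} \tag{16}$$

A similar set of equations can be obtained for the log-amplitude
dispersion, yielding

$$\begin{aligned}
a_x &= \tilde{a}_x \\
b_x &= \tilde{b}_x \left( \frac{1}{2 \times 6.6 \times 10^6} \right) \\
c_x &= \tilde{c}_x \left( \frac{1}{2 \times 6.6 \times 10^6} \right)^2.
\end{aligned} \tag{17}$$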


The measured log-amplitude and phase dispersion data of each event
were digitized at a sampling period of 0.2 second. Equations (15) contain
three unknowns ($\tilde{a}_\phi$, $\tilde{b}_\phi$, and $\tilde{c}_\phi$) in four equations. The four data
points for $\phi$ were obtained by arbitrarily setting $\phi_1 = 0$ and then
computing $\phi_2$, $\phi_3$, and $\phi_4$ from the measured differences. These four data
points were then fitted with a smooth curve under the least-squares
criterion, and $\tilde{c}_\phi$ was computed for the fitted curve. Note that this
procedure does not affect the computed value of $c_\phi$. Further, it
does not cause the kind of serious error that, as discussed by Babler,¹
arises when fitting a third-order polynomial curve that passes through all four
data points. The log-amplitude coefficient $\tilde{c}_x$ was obtained by fitting
a second-order polynomial curve through the four data points, $X_1$, $X_2$,
$X_3$, and $X_4$, which are directly available from measurements. The
coefficients $c_\phi$ and $c_x$ were then computed using eqs. (16) and (17).
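
The fitting step described above can be sketched as follows. This is only an illustrative sketch, not the authors' program: the function names `fit_quadratic` and `phase_curvature` are hypothetical, the normal-equation solver is one of several ways to do a least-squares fit, and the result is expressed in degrees/(MHz)² (frequency offsets in MHz) as an assumed unit convention.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a + b*x + c*x**2; returns (a, b, c)."""
    # Normal equations (A^T A) p = A^T y for the basis [1, x, x^2].
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting on the 3x4 system.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


# Tone offsets -13.2, -6.6, 0, +6.6 MHz divided by 13.2 MHz.
NORMALIZED_OFFSETS = (-1.0, -0.5, 0.0, 0.5)


def phase_curvature(d21, d32, d43, span_mhz=13.2):
    """Quadratic phase coefficient in degrees/(MHz)^2 (assumed units).

    d21, d32, d43 are the measured phase differences (degrees) between
    adjacent tones; the first tone's phase is set arbitrarily to zero.
    """
    phases = [0.0, d21, d21 + d32, d21 + d32 + d43]
    _, _, c_norm = fit_quadratic(NORMALIZED_OFFSETS, phases)
    # Undo the frequency normalization, as in eq. (16).
    return c_norm / span_mhz ** 2
```

Because only the quadratic coefficient is kept, adding a constant to all four phases changes the fitted constant term and leaves the curvature unchanged, which is why the arbitrary choice of the first phase is harmless.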

The quadratic phase nonlinearity can easily be related to the more
familiar parameter of delay distortion. The delay distortion, denoted
by $\tau(\omega)$, is defined as the departure of the envelope delay, $D(\omega)$, from
a constant value.* The envelope delay is given by®

$$D(\omega) = \frac{d\phi}{d\omega}. \tag{18}$$

From eqs. (12) and (18), we obtain

$$\tau(\Delta\omega) = \frac{c_\phi}{180 \times 10^6}\,(\Delta f), \tag{19}$$

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where $\tau(\Delta\omega)$ is the delay distortion in s/MHz, $c_\phi$ is expressed in
degrees/(MHz)², and $\Delta f$ is expressed in MHz.
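
As a quick numerical check of eq. (19), the conversion can be written as a one-line function. The name `delay_distortion_seconds` is hypothetical, and the comments give the unit bookkeeping as we understand it from the text.

```python
def delay_distortion_seconds(c_phi_deg_per_mhz2, delta_f_mhz):
    """Delay distortion from the quadratic phase coefficient, per eq. (19).

    With phi in degrees and f in MHz, phi_rad = (pi/180) * c_phi * df**2 and
    omega = 2*pi*f, so d(phi_rad)/d(omega) = c_phi * df / 180 in microseconds,
    i.e. c_phi * df / (180e6) in seconds.
    """
    return c_phi_deg_per_mhz2 * delta_f_mhz / 180.0e6


# 0.1 degree/(MHz)^2 over 1 MHz gives roughly 0.56 ns,
# which the text quotes as 0.55 nanosecond.
tau = delay_distortion_seconds(0.1, 1.0)
```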


VI. RESULTS 


Numerous fading events exceeding 10-dB fade were recorded. Of
these, 26 events faded deeper than 20 dB and were selected for data
analysis. This selection was based on the fact that only these display
phase distortion and consequently are of interest here. The first
selected event occurred during the period of September 16 to 18
(recorded on the tape that ran during this period). The twenty-sixth
event occurred during the period of November 20 to 23. No event
exceeded 20-dB fade between November 23 and December 31. Babler¹
has observed, in the amplitude dispersion experiment run during
approximately the same period on the same microwave link, 40 counts
of the tone at the center of the band dipping below 20 dB. In the
present analysis, an event is defined as beginning
when at least one tone exceeds a fade depth of 10 dB and lasting until
all the tones have recovered above the 10-dB fade level. Thus, one event
of ours could include more than one count of Babler’s experiment.
Considering the loss of time because of the system shutdowns and tape
run-outs mentioned in Section IV, the numbers of 20-dB
fade events in the two experiments agree satisfactorily.
Note that the number of fading events during the 1970 autumn
season appears to be significantly below normal. Unfortunately, not
all the events could be used in the analyses because of failure in part
of the instrumentation. Data from 16 events were used in
the analysis of phase nonlinearity [items (i) and (ii) of
Section V], and the correlation between log-amplitude
and phase nonlinearities [item (iii)] was computed for 14 events.
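
The event definition used here (start when any tone exceeds the 10-dB fade depth, end when all tones have recovered) can be sketched as below. The data layout (one tuple of four fade depths in dB per sample, positive meaning deeper) and the function names are assumptions made for illustration.

```python
def find_events(fade_db, threshold_db=10.0):
    """Return (start, end) sample index pairs, end exclusive.

    An event starts when at least one tone is faded by more than
    threshold_db and ends when every tone is back at or above that level.
    """
    events = []
    start = None
    for i, tones in enumerate(fade_db):
        in_fade = max(tones) > threshold_db
        if in_fade and start is None:
            start = i
        elif not in_fade and start is not None:
            events.append((start, i))
            start = None
    if start is not None:
        # The record ended while an event was still in progress.
        events.append((start, len(fade_db)))
    return events


def deep_events(fade_db, events, deep_db=20.0):
    """Keep only events in which some tone faded deeper than deep_db."""
    return [(s, e) for s, e in events
            if max(max(t) for t in fade_db[s:e]) > deep_db]
```

Under this definition a single event may contain several separate dips of one tone below 20 dB, which is why one event can correspond to more than one count in a per-tone tally.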


6.1 Distribution of Phase Nonlinearity 


The distribution of the phase nonlinear coefficient $c_\phi$ for the
pooled data of 16 events (consisting of 2284 samples) is shown in
Fig. 9. Three distribution curves represent the positive, the negative,
and the absolute values of $c_\phi$. Observe in Fig. 9 that 70 percent of the
samples yielded positive values for $c_\phi$ and the remaining 30 percent
were negative. No attempt was made to study the distributions of the
individual events, since the sample size was considered inadequate.
The distribution of $|c_\phi|$ is presented on a log-normal graph in Fig. 10.
Observe that the smooth curve fitted over the data points is close
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Fig. 9—Distribution curves for $c_\phi$. 


to a straight line. Hence, it appears that the distribution of the phase 
nonlinearity is close to log-normal. This conclusion is substantiated 
more quantitatively in the appendix. 
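
The kind of distribution curve discussed in this section (fraction of samples whose nonlinearity exceeds a given value, and the split between positive and negative samples) can be computed along the following lines. The function names are hypothetical, and any sample values fed to them are the reader's own; no measured data are reproduced here.

```python
def exceedance_fraction(samples, level):
    """Fraction of samples whose magnitude exceeds the given level."""
    return sum(1 for c in samples if abs(c) > level) / len(samples)


def sign_fractions(samples):
    """Fractions of strictly positive and strictly negative samples."""
    n = len(samples)
    positive = sum(1 for c in samples if c > 0) / n
    negative = sum(1 for c in samples if c < 0) / n
    return positive, negative
```

Evaluating `exceedance_fraction` over a range of levels and plotting the result against the logarithm of the level on a normal-probability scale gives the log-normal style of plot referred to above.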


6.2 Dependence of Phase Nonlinearity on Fade Depth 


Figure 11 shows the dependence of phase nonlinearity on the depth
of fade. The data points represent the magnitude of the quadratic nonlinear
coefficient $|c_\phi|$ as each event reached the fade depths of 10, 15,
20, 25, 30, 35, and 40 dB, as well as when it came out of the fade.
This approach of plotting the data points was adopted over that of
recording the value of $|c_\phi|$ at the maximum level of the fade for each
event for the following two reasons. First, the sample size is larger.
Second, and more important, as explained in Section IV, $|c_\phi|$ could
momentarily assume a zero value at the peak of the fade because
$c_\phi$ changes sign, and thus could lead to an erroneous result. The
smooth curve represents the average value of $|c_\phi|$ as a function of the
fade depth. We see that $|c_\phi|$ increases with fade depth beyond 20-dB
fade. The magnitude of $|c_\phi|$ remains constant at 0.02 degree/(MHz)²
below 20 dB, which is due to the limitation in the accuracy of the equipment
and the variance of $|c_\phi|$ in curve fitting. ($|c_\phi|$ should eventually
go to zero at 0-dB fade.)
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Fig. 10—Distribution curve for $|c_\phi|$: log-normal plot. 


Also presented on the ordinate scale is the delay distortion $\tau(\Delta\omega)$
of eq. (19), in seconds over a 1-MHz band. Observe that, corresponding
to $|c_\phi| = 0.1$ degree/(MHz)² and $\Delta\omega = 2\pi$ (that is, $\Delta f = 1$ MHz), the delay distortion is
calculated to be 0.55 nanosecond, and this occurs at a fade depth of
about 34 dB. This reasoning does not take into account any phase
dispersion ripples present between the tones (i.e., at frequency
separations of less than 6.6 MHz). Babler¹ and Ho’ have reported the
presence of such ripples (some of them of significant magnitude)
superposed on an overall smooth amplitude dispersion curve in a
20-MHz band. This fine structure is currently being investigated in a
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Fig. 11—Dependence of phase nonlinearity, $c_\phi$, on fade depth. 


more comprehensive study being conducted at Bell Laboratories by 
choosing tones that are closer than 6.6-MHz separation. 


6.3 Correlation Between Log-Amplitude and Phase Nonlinearities 


The correlation between the log-amplitude nonlinearity $\tilde{c}_x$ and phase
nonlinearity $\tilde{c}_\phi$ was computed for each event. Table I shows the results
for 14 events. The correlation coefficient was calculated for each event
using digitized data points for that part of the fade below the 10-, 15-,
20-, and 25-dB levels in order to establish the dependence of the correlation
coefficient $R$ on fade level. The results shown in Table I do not
easily lend themselves to any firm conclusions about the
physical model that would yield such data, yet they are presented here
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TABLE I—CORRELATION BETWEEN LOG-AMPLITUDE
AND PHASE NONLINEARITIES

                  $R(\tilde{c}_x; \tilde{c}_\phi)$
Event         10 dB    15 dB    20 dB    25 dB
70-3-1        +0.58    +0.58    +0.68    +0.77
70-3-2        -0.38    -0.40    -0.71      —
70-6-1        +0.80    +0.80    +0.60    +0.36
70-6-2        +0.68    +0.68    +0.70    +0.70
70-6-4        +0.50    +0.43      —        —
70-6-5        +0.49    +0.49    -0.70      —
70-6-6        +0.36    +0.39    +0.45    -0.48
70-6-8        +0.68    +0.93      —        —
70-13-1       -0.74    -0.74      —        —
70-16-1       -0.65    -0.67    -0.70    -0.71
70-16-2B      -0.39    -0.62    -0.74      —
70-18-1A      -0.65    -0.64    -0.71    -0.72
70-18-1B      -0.97    -0.97    -0.98    -0.99
70-23-1       -0.42    -0.42    -0.48    -0.48


to give an idea of the complexity of the problem. The following 
observations can be made on the results. 


(i) The positive and negative coefficients are equally probable.

(ii) The magnitude of the correlation coefficient, in general,
increases as the fade depth increases.

(iii) The correlation coefficient, in some cases, changes sign near
the peak of the fading.

(iv) It appears difficult to predict the behavior of the phase nonlinear
coefficient from a knowledge of the amplitude nonlinear
coefficient. One primary aim in pursuing this course of analysis
was to investigate whether a deterministic relationship could
be established between the two coefficients. Had this been
possible, it would have helped a communication system
designer to build an automatic phase compensator controlled
by measurements of the amplitude dispersion, which is far
simpler to measure than the phase dispersion.
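
A per-event correlation of the kind tabulated in Table I could be computed roughly as follows. This sketch assumes that each sample carries a single fade depth (for example, the deepest of the four tones, which is our assumption and is not stated in the text) together with the two quadratic coefficients; the function names are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient, or None if it is undefined."""
    n = len(xs)
    if n < 2:
        return None
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        return None
    return sxy / math.sqrt(sxx * syy)


def correlation_below_level(depth_db, c_x, c_phi, level_db):
    """Correlate c_x and c_phi using only samples faded by at least level_db."""
    pairs = [(x, p) for d, x, p in zip(depth_db, c_x, c_phi) if d >= level_db]
    if len(pairs) < 2:
        return None
    xs, ps = zip(*pairs)
    return pearson_r(xs, ps)
```

Returning `None` when too few samples lie below a level mirrors the blank entries in the table for events that did not fade that deep.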


VII. SUMMARY 


Measurements of phase and amplitude dispersion have been made
over a 20-MHz band at 6 GHz on a 42-km line-of-sight radio link.
The present experimental technique is unique in that the direct phase
dispersion, instead of delay distortion, has been measured using the
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scheme of very-long-baseline interferometry. The phase delay between
adjacent pairs of four tones equally spaced over a 19.8-MHz band was
measured. Twenty-six events were recorded during the autumn
season of 1970 that produced phase distortion of greater than 0.02
degree/(MHz)². All these events had fades which exceeded 20 dB
below the nominal level. Although the number of events observed
appears to be considerably less than normal, it was considered adequate
to derive some preliminary statistics about the phase dispersion
characteristics during deep dispersive microwave fading.

The quadratic phase nonlinear coefficient which is linearly propor- 
tional to the delay distortion is observed to have a log-normal distri- 
bution. To the best of the authors’ knowledge, this is the first time that 
statistics have been obtained on such phase characteristics. These 
results may be useful in the formulation of a statistical model of 
microwave fading. 

The quadratic phase nonlinear coefficient, and hence the delay
distortion, increases with the depth of fade. On the average, fades
deeper than 34 dB below nominal level cause a delay distortion in
excess of 0.55 nanosecond/MHz. For a monochrome television signal
with about 4-MHz bandwidth, the delay distortion permitted is about
25 nanoseconds.® If the same bandwidth is assumed in the radio-frequency
band, as would be the case for an amplitude-modulated system,
the average delay distortion for a 40-dB fade would be, from Fig. 11,
about 4 nanoseconds, which is well within the tolerance. Over
the measured 20-MHz band, the delay distortion at 40-dB fade would
be about 20 nanoseconds.

During the deep dispersive fades, the nonlinear phase and amplitude
coefficients were found to have nonzero correlation, which could be
either positive or negative. No simple relationship is apparent
between the two coefficients. Although simple two-ray models can be
made to account for both amplitude and phase dispersion,’ the complex
temporal behavior indicated in our results, along with that reported
by Babler,¹ leads us to believe that a multiray (more than two
rays) phenomenon is the cause of these deep fades.
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APPENDIX 


To determine the nature of the distribution curve for the absolute
value of $c_\phi$, an entirely empirical approach to the problem was adopted.
Plotting the raw data of the phase nonlinear coefficient $|\tilde{c}_\phi|$ on
log-normal graph paper indicated that the distribution was close to
log-normal. In the literature’ a powerful technique exists which
transforms, in most cases, a given distribution to a normal distribution
by a suitable transformation. Use of this technique was expected
to yield information on how closely the observed set of data
approximated a log-normal distribution. Following Box and Cox,® who
have treated such an analysis of transformations, the following
transformation was chosen.

$$\tilde{c}_{\phi i}^{(\lambda)} =
\begin{cases}
\dfrac{\tilde{c}_{\phi i}^{\,\lambda} - 1}{\lambda} & (\lambda \neq 0) \\[1ex]
\log \tilde{c}_{\phi i} & (\lambda = 0).
\end{cases} \tag{20}$$

Here, $\lambda$ is the parameter that defines the transformation. (For the
sake of convenience, the absolute value signs have been omitted.) In
accordance with our assumption, a value exists for $\lambda$ such that $\tilde{c}_{\phi i}^{(\lambda)}$
is normally distributed. The value of $\lambda$ can be determined using
maximum likelihood theory. The method of maximum likelihood
involves maximizing the log-likelihood $L_{\max}$ with respect to
the unknown parameters $\mu$, $\sigma^2$, and $\lambda$, where $\mu$ and $\sigma^2$ are the mean
and variance of the transformed data $\tilde{c}_{\phi i}^{(\lambda)}$. The maximum likelihood
estimate of $\lambda$, denoted by $\hat{\lambda}$, yields the best possible estimate that
would make the transformed variable $\tilde{c}_{\phi i}^{(\lambda)}$ closest to a normal
distribution. The maximum log-likelihood function is given by

$$L_{\max}(\lambda \mid \tilde{c}_{\phi 1}, \tilde{c}_{\phi 2}, \ldots, \tilde{c}_{\phi n}) = -\frac{n}{2} \log \sigma^2 + (\lambda - 1) \sum_{i=1}^{n} \log \tilde{c}_{\phi i}, \tag{21}$$

where $n$ is the number of data points.

The value of $\lambda$ was varied between $-1$ and $+1$ in steps of 0.1. The
maximum value of $L_{\max}(\lambda)$ as a function of $\lambda$ occurs at $\lambda = -0.1$.
Thus, the maximum likelihood estimate for $\lambda$ is $\hat{\lambda} = -0.1$. Figure 12
shows the data points of $\tilde{c}_{\phi i}^{(\hat{\lambda})}$ on a log-normal graph. The straight
line through the data points corresponds to $\lambda = 0$. It is seen from
eq. (20) that $\lambda = 0$ yields a log-normal distribution. Except at the
large values of nonlinearity, the distribution appears log-normal. This
is further justified by the fact that $\hat{\lambda} = -0.1$ is statistically quite
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Fig. 12—Distribution of $\tilde{c}_{\phi i}^{(\lambda)}$: log-normal plot: $\lambda = -0.1$. 


close to zero. The deviation at the high end can be largely attributed 
to poor sample size. 
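
The grid search over the transformation parameter described in this appendix can be sketched as follows. This is a minimal illustration, assuming strictly positive input values (the magnitudes of the coefficient) and the maximum-likelihood (divide by n) variance; the function names are hypothetical.

```python
import math


def box_cox(values, lam):
    """Box-Cox transform of positive values, as in eq. (20)."""
    if lam == 0.0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def profile_log_likelihood(values, lam):
    """L_max(lambda) of eq. (21), up to an additive constant."""
    t = box_cox(values, lam)
    n = len(t)
    mean = sum(t) / n
    var = sum((x - mean) ** 2 for x in t) / n
    return -0.5 * n * math.log(var) + (lam - 1.0) * sum(math.log(v) for v in values)


def best_lambda(values, lo=-10, hi=10, scale=10):
    """Scan lambda from lo/scale to hi/scale in steps of 1/scale."""
    grid = [k / scale for k in range(lo, hi + 1)]
    return max(grid, key=lambda lam: profile_log_likelihood(values, lam))
```

With the default arguments the scan covers $-1$ to $+1$ in steps of 0.1. An estimate near zero indicates that the logarithm of the data is close to normally distributed, that is, that the data are close to log-normal.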
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Perturbation Calculations of Rain-Induced Differential 
Attenuation and Differential Phase Shift at Microwave 
Frequencies 


By J. A. MORRISON and T. S. CHU 
(Manuscript received August 24, 1973) 


In a recent note¹ calculated results of differential attenuation and
differential phase shift, as a function of rain rate, were given at
frequencies of 4, 18.1, and 30 GHz. The calculations have since been done
at 11 GHz also. These results are based on scattering of a plane
electromagnetic wave by oblate spheroidal raindrops. The point matching
procedure used to obtain nonperturbative solutions to the problem was
briefly described, and full details will be presented later.² Somewhat
similar calculations have been carried out by Oguchi³ at 19.3 and
34.8 GHz.

The purpose of this note is to point out that a modification of
Oguchi’s earlier first-order perturbation approximation,⁴ for spheroidal
raindrops with small eccentricity, gives results which are quite close
to those obtained by the point matching procedure. We also give these
modified perturbation results at frequencies in the range up to 100
GHz, although they may be less reliable at the higher frequencies,
particularly at the heavier rain rates.⁴ We remark that the perturbation
results are obtained quite inexpensively, whereas the point matching
procedure is very costly.

The surface of an oblate spheroidal raindrop is given in spherical
coordinates by

$$r = R(\theta) = a(1 - \nu \sin^2\theta)^{-1/2} = a\left[1 + \tfrac{1}{2}\nu \sin^2\theta + O(\nu^2)\right], \tag{1}$$

for $0 \le \theta \le \pi$, independently of the azimuthal angle $\phi$. It was assumed¹
that the ratio of minor to major axis depends linearly on the radius
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$\bar{a}$ (in cm) of the equivolumic spherical drop; specifically $a/b = (1 - \bar{a})$.
Thus, from (1), $a = \bar{a}(1 - \bar{a})^{2/3}$ and $\nu = \bar{a}(2 - \bar{a})$. We may rewrite (1)
in the form

$$R(\theta) = \bar{a}\left[1 + 2\bar{a}\left(\tfrac{1}{2}\sin^2\theta - \tfrac{1}{3}\right) + O(\bar{a}^2)\right], \tag{2a}$$

or

$$R(\theta) = \bar{a}\left[1 + \nu\left(\tfrac{1}{2}\sin^2\theta - \tfrac{1}{3}\right) + O(\nu^2)\right]. \tag{2b}$$

Then, rather than perturbing about a spherical drop of radius $a$, with
perturbation parameter $\nu$, as did Oguchi,⁴ we perturb about the
equivolumic spherical drop of radius $\bar{a}$, and take either $\nu$ or $2\bar{a}$ as the
perturbation parameter.


[Figure legend: crosses, point matching method; solid curves, perturbation about an equivolumic sphere with $\nu = 2\bar{a}$; dots, perturbation about an equivolumic sphere with $\nu = \bar{a}(2 - \bar{a})$.]
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Fig. 1—Rain-induced differential attenuation. 
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Fig. 2—Rain-induced differential phase shift. 


Oguchi’s first-order perturbation results have been generalized to
axisymmetric raindrops which are nearly spherical, as will be discussed
in the detailed paper.² There the first-order approximations to the
forward scattering functions $S_{\mathrm{I}}(0)$ and $S_{\mathrm{II}}(0)$, for horizontally disposed
oblate spheroidal raindrops, will be compared to the values obtained
by the point matching method, for the 14 different drop sizes $\bar{a} = 0.025$
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Fig. 3—Comparison between point matching and three perturbation methods for 
differential attenuation. 


(0.025) 0.35. The subscripts I and II correspond to vertical and
horizontal polarizations of the incident electric field, respectively. Here we
consider only the differential attenuation and differential phase shift,
which are obtained¹ by summing the real and imaginary parts of
$S_{\mathrm{II}}(0) - S_{\mathrm{I}}(0)$ over the Laws and Parsons drop size distribution.’ We
comment that for the larger drop sizes the perturbation parameter is
not small.

Although extra first-order correction terms arise in the expansions 
about the equivolumic spherical drop, given in (2), they correspond to 
a constant change in the radius of the drop. Hence the corresponding 
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Fig. 4a—Comparison between point matching and three perturbation methods 
for differential phase shift at 11 GHz. 
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Fig. 4b—Comparison between point matching and three perturbation methods for 
differential phase shift at 30 GHz. 


increments in the forward scattering functions are the same for both
polarizations, and therefore do not affect the difference $S_{\mathrm{II}}(0) - S_{\mathrm{I}}(0)$.
Thus Oguchi’s formulas⁴ may be applied directly to calculate this
difference, by replacing $a$ by $\bar{a}$, and using either $\nu$, or $\nu = 2\bar{a}$, as the
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perturbation parameter. We remark, however, that some simplifications
may be made² in the four expressions given in equation (37) of
Oguchi’s 1960 paper.

Using the approximate forward scattering functions from the
perturbations about an equivolumic sphere, we obtained the differential
attenuation and differential phase shift versus frequency for various
rain rates as shown in Figs. 1 and 2. The refractive indexes of water at
20°C were obtained as described in the previous note.¹ The curves are
calculated with the perturbation parameter $\nu = 2\bar{a}$, while the dots are
calculated with the perturbation parameter $\nu = \bar{a}(2 - \bar{a})$. The point matching
solutions are included as crosses; these show good agreement with the
curves, whereas the dots deviate more from the crosses. (In order to
avoid confusion, we have omitted the dots and crosses corresponding to
the differential phase shift at 30 GHz.) On the other hand, we found
much greater discrepancy between the point matching results and the
approximate results from the perturbations about an inscribed sphere.
This discrepancy is illustrated in Figs. 3, 4a, and 4b for 11 and 30 GHz.
Discrepancies for 4 and 18.1 GHz are similar to those for 11 GHz. The
above comparison between the point matching solution and the three
perturbation solutions is consistent with the order of geometrical
errors in the three approximations to the oblate spheroid, i.e., the
largest error corresponds to (1) and the smallest error corresponds
to (2a).
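
The ordering of geometrical errors can be checked numerically for a single drop size. The sketch below is an illustration only: it takes the relations $a/b = 1 - \bar{a}$, $a = \bar{a}(1 - \bar{a})^{2/3}$, and $\nu = \bar{a}(2 - \bar{a})$ from the text, evaluates the exact surface (1) and the first-order forms (1), (2a), and (2b) on a grid of polar angles, and reports the largest deviation of each. The function names and the grid resolution are arbitrary choices.

```python
import math


def spheroid_parameters(a_bar):
    """Return (a, b, nu) for an equivolumic radius a_bar in cm."""
    axis_ratio = 1.0 - a_bar                 # a/b, assumed linear in a_bar
    a = a_bar * axis_ratio ** (2.0 / 3.0)    # minor semi-axis, equal volume
    b = a / axis_ratio                       # major semi-axis
    nu = a_bar * (2.0 - a_bar)               # nu = 1 - (a/b)^2
    return a, b, nu


def max_radius_errors(a_bar, steps=180):
    """Largest |approximate - exact| radius for forms (1), (2a), (2b)."""
    a, _, nu = spheroid_parameters(a_bar)
    err1 = err2a = err2b = 0.0
    for k in range(steps + 1):
        s2 = math.sin(math.pi * k / steps) ** 2
        exact = a / math.sqrt(1.0 - nu * s2)
        approx1 = a * (1.0 + 0.5 * nu * s2)
        approx2a = a_bar * (1.0 + 2.0 * a_bar * (0.5 * s2 - 1.0 / 3.0))
        approx2b = a_bar * (1.0 + nu * (0.5 * s2 - 1.0 / 3.0))
        err1 = max(err1, abs(approx1 - exact))
        err2a = max(err2a, abs(approx2a - exact))
        err2b = max(err2b, abs(approx2b - exact))
    return err1, err2a, err2b
```

For $\bar{a} = 0.2$ cm, for example, the first-order form (1) deviates most from the exact surface and (2a) least, in line with the ordering stated above; other drop sizes can be examined the same way.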

The differential attenuation and differential phase shift recently pre- 
sented by Watson and Arbabi® from 4 through 36 GHz are based upon 
Oguchi’s perturbation solution. These numerical values are in general 
considerably lower than those of the point matching solution, except 
for the differential phase shift around 30 GHz, where the differential 
phase shift from the point matching method decreases sharply. The 
differential phase shift becomes negative at millimeter wavelengths, 
and hence remains a significant factor in depolarization. 

The authors are indebted to Susan Hoffberg who wrote the programs 
for calculating the first-order perturbation approximations, and to 
Diane Vitello who performed the summation over the drop size 
distribution. 
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Erratum 


“A Theory of Traffic-Measurement Errors for Loss Systems With 
Renewal Input,” B.S.T.J., Vol. 52, No. 6, July-August 1973, pp. 
967-990, by S. R. Neal and A. Kuczura. 

A transcription error occurred in eq. (18). It should be 


2 c 
ELKiX1] = wile) + es u1(C) >» i(k) + De — DO, (18) 


and was derived in Ref. 4. This is a transcription error only and does 
not alter the numerical results or the conclusions (S.R.N. and A.K.).
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